LINEAR ALGEBRA AND MULTIDIMENSIONAL GEOMETRY

N. V. EFIMOV
E. R. ROZENDORN

MIR PUBLISHERS • MOSCOW



Н. В. ЕФИМОВ
Э. Р. РОЗЕНДОРН

ЛИНЕЙНАЯ АЛГЕБРА И МНОГОМЕРНАЯ ГЕОМЕТРИЯ

ИЗДАТЕЛЬСТВО «НАУКА»




Translated  from  the  Russian 
by 

GEORGE  YANKOVSKY 




First  published  1975 

Revised  from  the  1974  Russian  edition 


In English

© Izdatel'stvo «Nauka», 1974

© English translation, Mir Publishers, 1975


CONTENTS

Preface
Introduction

Chapter I. Linear Spaces

1. Axioms of a linear space
2. Examples of linear spaces
3. Elementary corollaries to the axioms of a linear space
4. Linear combinations. Linear dependence
5. Lemma on the basis minor
6. Basic lemma on two systems of vectors
7. The rank of a matrix
8. Finite-dimensional and infinite-dimensional spaces. Bases
9. Linear operations in components
10. Isomorphism between linear spaces
11. Correspondence between complex and real spaces
12. Linear subspace
13. Linear hull
14. Sum of subspaces. Direct sum

Chapter II. Linear Transformations of Variables. Transformations of Coordinates

1. Abbreviated notation for summation
2. Linear transformation of variables. The product of linear transformations of variables and matrix products
3. Square matrices and nonsingular transformations
4. The rank of a product of matrices
5. Transformation of coordinates in a change of basis

Chapter III. Systems of Linear Equations. Planes in Affine Space

1. Affine space
2. Affine coordinates
3. Planes
4. Systems of first-degree equations
5. Homogeneous systems
6. Nonhomogeneous systems
7. Mutual positions of planes
8. Systems of linear inequalities and convex polyhedrons




Chapter IV. Linear, Bilinear and Quadratic Forms

1. Linear forms
2. Bilinear forms
3. The matrix of a bilinear form
4. Quadratic forms
5. Reducing a quadratic form to canonical form by Lagrange's method
6. The normal form of a quadratic form
7. The law of inertia of quadratic forms
8. Reducing a quadratic form to canonical form by Jacobi's method
9. Positive definite and negative definite quadratic forms
10. Gram's determinant. The Cauchy-Bunyakovsky inequality
11. Zero subspaces of a bilinear and a quadratic form
12. The zero cone of a quadratic form
13. Elementary examples of zero cones of quadratic forms

Chapter V. Tensor Algebra

1. Reciprocal bases. Contravariant and covariant vectors
2. Tensor product of linear spaces
3. Basis in a tensor product. Components of a tensor
4. Tensors of bilinear forms
5. Multiple-order tensors. Tensor product
6. Components of multiple-order tensors
7. Multilinear forms and their tensors
8. Symmetrization and antisymmetrization (alternation). Skew-symmetric forms
9. An alternative description of the tensor product of two linear spaces

Chapter VI. Groups and Some Applications

1. Groups and subgroups. Distribution of bases into classes with respect to a given subgroup of matrices. Orientation
2. Transformation groups. Isomorphism and homomorphism of groups
3. Invariants. Axial invariants. Pseudoinvariants
4. Tensor quantities
5. The oriented volume of a parallelepiped. The discriminant tensor

Chapter VII. Linear Transformations of Linear Spaces

1. Generalities
2. A linear transformation as a tensor
3. The geometrical meaning of the rank and determinant of a linear transformation. The group of nonsingular linear transformations
4. Invariant subspaces
5. Examples of linear transformations
6. Eigenvectors and the characteristic polynomial of a transformation
7. Basic theorems on the characteristic polynomial and eigenvectors
8. Nilpotent transformations. The general structure of singular transformations
9. The canonical basis of a nilpotent transformation
10. Reducing a transformation matrix to the Jordan normal form




11. Transformations of a simple structure
12. Equivalence of matrices
13. The Hamilton-Cayley formula

Chapter VIII. Spaces with Quadratic Metric

1. Scalar products
2. The norm of a vector
3. Orthonormal bases
4. Orthogonal projection. Orthogonalization
5. Metric isomorphism
6. ^-orthogonal matrices and ^ orthogonal groups
7. The group of Euclidean rotations
8. The group of hyperbolic rotations
9. Tensor algebra in quadratic-metric spaces
10. The equation of a hyperplane in quadratic-metric space
11. Euclidean space. Orthogonal matrices. Orthogonal group
12. The normal equation of a hyperplane in Euclidean space
13. The volume of a parallelepiped in Euclidean space. The discriminant tensor. Vector product

Chapter IX. Linear Transformations of Euclidean Space

1. Adjoint of a transformation
2. Lemma on the characteristic roots of a symmetric matrix
3. Self-adjoint transformations
4. Reducing a quadratic form to canonical form in an orthonormal basis
5. The joint reduction to canonical form of two quadratic forms
6. Skew-adjoint transformations
7. Isometric transformations
8. The canonical form of an isometric transformation
9. The motion of a rigid body with one fixed point
10. The curvature and torsion of a space curve
11. The decomposition of an arbitrary linear transformation into the product of a self-adjoint and an isometric transformation
12. Applications to the theory of elasticity. The strain tensor and the stress tensor

Chapter X. Multivectors and Outer Forms

1. Alternation
2. Multivectors. Outer product
3. Bivectors
4. Simple multivectors
5. Vector product
6. Outer forms and operations on them
7. Outer forms and covariant multivectors
8. Outer forms in three-dimensional Euclidean space

Chapter XI. Quadric Hypersurfaces

1. The general equation of a quadric hypersurface
2. Changes in the left member of the equation under translation of the origin
3. Changes in the left member of the equation for a change in the orthonormal basis
4. The centre of a quadric hypersurface
5. Reducing to canonical form the general equation of a quadric hypersurface in Euclidean space




6. Classification of quadric hypersurfaces in Euclidean space
7. Affine transformations
8. Affine classification of quadric hypersurfaces
9. The intersection of a straight line with a quadric hypersurface. Asymptotic directions
10. Conjugate directions

Chapter XII. Projective Space

1. Homogeneous coordinates in affine space. Points at infinity
2. The concept of a projective space
3. A bundle of planes in affine space
4. Central projection
5. Projective equivalence of figures
6. Projective classification of quadric hypersurfaces
7. The intersection of a quadric hypersurface and a straight line. Polars

Appendix 1. Proof of the theorem on the classification of linear quantities
Appendix 2. Hermitian forms. Unitary space

Bibliography

Index


PREFACE 


This book was conceived as a text combining the courses of
linear algebra and analytic geometry. It originated as a course
of lectures delivered by N. V. Efimov at Moscow State University
(mechanics and mathematics department) in 1964-1966. However,
the material of these lectures has been completely reworked and
substantially expanded. We have tried to bear in mind the
requirements of other mathematical disciplines and also of mechanics
and physics. We hope that all parts of the text will be useful. The
only preparation required for this text can be given in a
first-semester course of analytic geometry and algebra at the most
elementary level. All that is needed is a firm grasp of the elements
of these subjects. For Chapter XII the student should be
acquainted with projective transformations and the projective
properties of figures in the plane. Also, in Chapter X the reader may
simplify his task by skipping Subsections 13 to 23 (Section 3) and
Subsection 10 of Section 7. What is left of Chapter X can serve as
a minimal algebraic basis for the theory of multidimensional
integration.

It may be noted in conclusion that the first five chapters already
contain material with broad applications in mathematics, mechanics,
and physics. These chapters, supplemented with some of the
material of subsequent chapters, can be utilized in higher technical
schools with a more advanced mathematics curriculum.


N.  V.  Efimov 
E. R. Rozendorn


INTRODUCTION 


In  mathematics  and  its  applications,  one  often  has  to  deal  with 
certain  sets  of  objects  for  which  so-called  linear  operations  have 
been  defined:  addition  and  multiplication  by  a  number  (scalar). 
For  example,  in  mechanics  we  consider  all  kinds  of  forces  applied 
to  a  given  rigid  body.  Two  forces  applied  to  a  single  point  may 
be  added,  that  is  to  say,  replaced  by  a  single  force  applied  to  that 
point. The force may be multiplied by a scalar α, which means
“increased α times” in the direction of action. In mechanics we
also  consider  the  composition  of  velocities  and  the  multiplication 
of  a  velocity  by  a  scalar,  and  the  composition  of  accelerations  and 
the  multiplication  of  an  acceleration  by  a  scalar.  Forces,  velocities 
and  accelerations  differ  as  to  their  physical  nature,  but  the  linear 
operations  performed  on  them  are,  from  the  geometrical  point  of 
view,  of  a  single  nature.  It  is  for  this  reason  that  in  mechanics 
we  have  a  general  unified  mode  of  depicting  these  entities  in  the 
form  of  directed  line  segments.  In  this  way,  they  are  all  handled 
by the general rules of addition and scalar multiplication of
geometrical vectors.

However,  this  generalization  goes  much  farther.  Consider,  for 
example,  the  set  of  all  functions  continuous  on  the  real  number 
line,  or  the  set  of  all  periodic  functions  with  a  given  period,  or  the 
set  of  all  algebraic  polynomials.  It  is  quite  natural  in  each  of 
these  sets  to  consider  linear  operations  (understanding  the  sum  of 
functions  and  the  product  of  a  function  by  a  number  as  is  usual 
in analysis). The objects we are now speaking of are not like
forces, velocities or accelerations, or geometrical vectors. Moreover,
the linear operations performed on them differ from the linear
operations performed on the vector quantities of mechanics or on
geometrical vectors.

However,  there  is  something  common  to  them  all  that  permits 
studying  linear  operations  abstractly,  quite  apart  from  the  specific 
nature  of  the  entities. 




Firstly, in any one of our examples the linear operations carried
out on the elements of a given set (that is, on the objects that
make up the set) yield elements of the same set. Namely, by
adding geometrical vectors or multiplying them by a scalar, we
obtain geometrical vectors; by adding continuous functions or
multiplying them by a scalar, we get continuous functions. The
same goes for periodic functions with a given period and for
algebraic polynomials.

What  is  more,  linear  operations  that  differ  for  different  sets 
have  certain  common  properties  (which  will  be  examined  in  the 
first  chapter).  The  existence  of  common  properties  permits  us  to 
study  linear  operations  as  such. 

The  study  of  sets  with  specified  linear  operations  leads  to  the 
concept  of  a  linear  space.  The  theory  of  linear  spaces  finds  very 
broad  applications  in  modern  mathematics  and  allied  sciences. 

A  linear  space  will  be  defined  in  the  first  section.  It  will  not 
contain  any  description  of  the  elements  of  the  sets  considered  or 
of  the  linear  operations  performed.  The  only  thing  required  will 
be  certain  properties  of  linear  operations  that  are  common  to  all 
particular  cases.  These  requirements  are  expressed  as  the  axioms 
of  a  linear  space.  It  is  worth  mentioning  that  the  requirements 
expressed in the axioms are very few and there remains the
possibility of adding new assumptions to them. Therefore, a certain
classification appears in the general concept of a linear space so
that, actually, we are dealing not with a single linear space but
with  distinct  classes  of  linear  spaces,  and  the  theory  based  on 
the  axioms  of  a  linear  space  becomes  diversified. 

All linear spaces may be separated into finite-dimensional and
infinite-dimensional spaces. Finite-dimensional spaces
(one-dimensional, two-dimensional, three-dimensional, and so forth) are
studied in linear algebra, which makes up the subject matter of this
text. Infinite-dimensional spaces are considered in various parts of
functional analysis. We will speak of them only occasionally to
illustrate certain general conclusions.

An instance of a finite-dimensional space is the three-dimensional
space of geometrical (free) vectors. This space contains
within it an infinite number of two-dimensional and one-dimensional
spaces called subspaces (every two-dimensional subspace
consists of vectors lying in one plane, and every one-dimensional
subspace consists of vectors lying on a single straight line). Thus,
for one-, two- and three-dimensional linear spaces we have
geometrical models that are naturally associated with our pictorial
conceptions of vectors. When passing to multidimensional spaces,
the pictorial nature of the entities is partially lost, but the theory
of these spaces retains its geometrical character. The point is that
its basic concepts are constructed by borrowing from the
three-dimensional case and appropriately generalizing to the
multidimensional case. The retention of geometrical terminology also
plays a part. For instance, when speaking of diverse sets, we call
them spaces. Note too that the elements of any linear space are
conventionally called vectors. And so linear spaces are also
termed vector spaces. The geometrical nature of the terminology
and of the basic concepts of linear algebra helps to make contact
with geometry. We have in view here analytic geometry and,
particularly, multidimensional analytic geometry, that is, the
multidimensional analogue of ordinary (three-dimensional) analytic
geometry. What is more, linear algebra and analytic geometry are
so closely connected that it is difficult to draw any hard and fast
line between them. And we will not try to do that. We have
already stated that the subject matter of this text is linear algebra.
With the same justification we can say that its subject is
multidimensional analytic geometry.


Chapter  I 


LINEAR  SPACES 


§  1.  Axioms  of  a  linear  space 

1. Suppose we have a set L consisting of any kind of elements.
We denote these elements by the lower-case Latin letters a, b, ...,
x, y, .... In only one case, however, will we use a lower-case
Greek letter for an element, namely θ. Together with the elements
of the set L we will consider any numbers, real or complex, which
will be denoted by lower-case Greek letters α, β, ... (with the
exception of θ).

2. We assume that the concept of equality of elements has been
defined in the set L. This means that all elements of L have been
distributed in some way into classes (subsets of L) so that distinct
classes do not have any elements in common. Then two elements,
a and b, are taken to be equal (a = b) if they belong to the same
class. Every class can also consist of a single element, in which
case the equality a = b means that a and b denote the same
element of L.

Later  on  we  will  sometimes  have  to  do  with  what  we  call  an 
admissible  replacement  of  one  element  of  L  with  another  element, 
if  the  replacement  is  made  within  one  class;  that  is  to  say,  one 
element  is  replaced  by  any  equal  element. 

3. In some cases, instead of a specified partition of L into
classes of equal elements, certain conditions of admissible
replacements will be indicated (that is, conditions under which the
elements are taken to be equal). Then, for an arbitrary element a
in L there will be defined a class 𝒜 consisting of all elements of L
equal to the element a. However, in order to obtain the required
partition of L into such classes, three circumstances must be
ensured.

(1) The element a itself must belong to the class 𝒜; that is,
the conditions of equality must be such that the element a is
considered equal to itself: a = a (in other words, the replacement
of an element by itself must be considered admissible).

(2) If a = b, then it must be true that b = a.

(3) If a = b and b = c, then it must be true that a = c.

If and only if these three circumstances hold will any two
elements belonging to the class 𝒜 be equal. Besides, the class 𝒜
then includes all elements of L that are equal to some one element
of that class.

The  foregoing  is  illustrated  by  the  examples  of  Section  2. 
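
The three circumstances of Subsection 3 can be tried out on a small finite set. The following is a minimal sketch in Python, not part of the original text; the names `is_equivalence`, `classes` and `is_partition`, and the two sample rules of equality, are assumptions chosen for illustration. It checks conditions (1)-(3) by brute force and tests whether the resulting classes are pairwise identical or disjoint.

```python
from itertools import product


def is_equivalence(elements, equal):
    """Check conditions (1)-(3) of Subsection 3 on a finite set."""
    reflexive = all(equal(a, a) for a in elements)
    symmetric = all(
        equal(b, a) for a, b in product(elements, repeat=2) if equal(a, b)
    )
    transitive = all(
        equal(a, c)
        for a, b, c in product(elements, repeat=3)
        if equal(a, b) and equal(b, c)
    )
    return reflexive and symmetric and transitive


def classes(elements, equal):
    """For each element a, collect the class of all elements equal to a."""
    return {a: frozenset(b for b in elements if equal(a, b)) for a in elements}


def is_partition(elements, equal):
    """True if any two classes are either identical or have no common element."""
    cls = set(classes(elements, equal).values())
    return all(c == d or not (c & d) for c in cls for d in cls)


elements = range(6)
same_remainder = lambda a, b: a % 3 == b % 3  # satisfies (1), (2) and (3)
close = lambda a, b: abs(a - b) <= 1          # violates (3)

print(is_equivalence(elements, same_remainder), is_partition(elements, same_remainder))
print(is_equivalence(elements, close), is_partition(elements, close))
```

For the second rule, 0 is "equal" to 1 and 1 to 2, but 0 is not "equal" to 2, so the classes overlap without coinciding and no partition of the set is obtained.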

4. We will say that the operations of addition and multiplication
by a scalar are defined in the set L if:

(1) to every two elements a, b of L there is associated a certain
element of L called their sum and denoted by a + b;

(2) to every scalar α and every element a in L there is
associated a certain element of L called the product of a by α or
of α by a; this product is denoted by αa or aα.

It is assumed that the operations of addition and scalar
multiplication are invariant with respect to admissible replacements of
the elements of the set L: if a = a', b = b', then a + b = a' + b'
and αa = αa'.

Also, the following eight axioms are assumed to hold true:

(1) for any a, b in L,

a + b = b + a

This is the commutative property for addition;

(2) for any a, b, c in L,

(a + b) + c = a + (b + c)

This is the associative property. It permits writing a sum without
recourse to brackets: a + b + c = (a + b) + c = a + (b + c).
Also, by the first axiom, the order of the terms is immaterial;

(3) the set L has an element θ such that

a + θ = a

for any a in L. θ is called the zero element;

(4) for every element x in L there is an element y in L such
that

x + y = θ

Element y is called the inverse of x and is denoted -x;

(5) 1 · a = a;

(6) α(βa) = (αβ)a;

(7) (α + β)a = αa + βa;

(8) α(a + b) = αa + αb.

In the last four axioms, a and b denote arbitrary elements of L;
α and β are arbitrary scalars.



Note that the property expressed by the seventh axiom is called
the distributive property for a factor taken from L (this axiom
permits distributing a factor from L over the components of a
numerical factor). The eighth axiom expresses the distributive
property for a numerical factor.

5. Basic definition. The set L, together with the operations
specified in it of addition and multiplication by a scalar, is called a
linear space.

We stress the fact that it is assumed in this definition that
addition and multiplication satisfy all the properties enumerated in
Subsection 4.

The eight axioms of Subsection 4 are called the axioms of a
linear space.
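
The eight axioms can be spot-checked mechanically for concrete operations. The sketch below is an illustration only and is not taken from the text: the function `check_axioms`, the sample pairs of rational numbers and the deliberately faulty rule `bad_scale` are all assumptions. A check on finitely many samples can refute an axiom but cannot prove it.

```python
from fractions import Fraction as F
from itertools import product


def check_axioms(vectors, scalars, add, scale, zero, neg):
    """Spot-check axioms (1)-(8) on finite samples; return the axioms that failed."""
    failed = set()
    for a, b, c in product(vectors, repeat=3):
        if add(a, b) != add(b, a):
            failed.add(1)
        if add(add(a, b), c) != add(a, add(b, c)):
            failed.add(2)
    for a in vectors:
        if add(a, zero) != a:
            failed.add(3)
        if add(a, neg(a)) != zero:
            failed.add(4)
        if scale(1, a) != a:
            failed.add(5)
    for alpha, beta in product(scalars, repeat=2):
        for a, b in product(vectors, repeat=2):
            if scale(alpha, scale(beta, a)) != scale(alpha * beta, a):
                failed.add(6)
            if scale(alpha + beta, a) != add(scale(alpha, a), scale(beta, a)):
                failed.add(7)
            if scale(alpha, add(a, b)) != add(scale(alpha, a), scale(alpha, b)):
                failed.add(8)
    return sorted(failed)


# Sample elements: pairs of rational numbers (exact arithmetic, no rounding).
vectors = [(F(0), F(0)), (F(1), F(2)), (F(-3, 2), F(5)), (F(1, 3), F(-1, 7))]
scalars = [F(0), F(1), F(-2), F(3, 4)]

add = lambda a, b: (a[0] + b[0], a[1] + b[1])
neg = lambda a: (-a[0], -a[1])
zero = (F(0), F(0))

scale = lambda alpha, a: (alpha * a[0], alpha * a[1])  # the usual rule
bad_scale = lambda alpha, a: (alpha * a[0], a[1])      # multiplies only the first component

print(check_axioms(vectors, scalars, add, scale, zero, neg))
print(check_axioms(vectors, scalars, add, bad_scale, zero, neg))
```

With the usual rule no axiom fails on the samples. With `bad_scale` only the seventh axiom fails, since (α + β)a keeps the second component of a while αa + βa doubles it.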

As  has  already  been  mentioned  in  the  introduction,  the  elements 
of  a  linear  space  are  also  called  vectors,  and  so  a  linear  space  is 
likewise  known  as  a  vector  space.  Very  often  we  will  call  the 
set  L  a  space  without  using  any  modifying  adjective,  and  it  will 
be  assumed  to  be  a  vector  space. 

6. If we have a space L and multiplication of vectors solely by
real numbers is defined, then L is termed a real vector space. If
multiplication of the vectors of L is also defined for complex
numbers, then the space L is termed a complex vector space.

In  the  future,  the  term  “arbitrary  scalar”  will  mean  any  real 
number  if  we  are  speaking  of  a  real  space  and  any  complex 
number  if  we  are  dealing  with  a  complex  space. 

A  substantial  portion  of  the  facts  stated  in  the  first  few  chapters 
of  this  book  refer  both  to  real  and  complex  spaces.  If  a  certain 
property  holds  true  only  for  a  real  or  only  for  a  complex  space, 
that  will  be  specially  pointed  out. 

7. Occasionally, instead of multiplication by a real or a complex
number we consider multiplication of the elements of L by
elements of an arbitrary algebraic field U (all eight axioms of
a linear space must hold true, of course). In this case, the set
together with the specified linear operations is termed a linear (or
vector) space over the field U.

§  2.  Examples  of  linear  spaces 

Preliminary remark. If, relative to any specific set equipped
with linear operations, it is asserted that the set is a linear space,
then to prove that assertion it is necessary to verify that the
specified operations are indeed linear, that is to say, that they satisfy
the requirements of the eight axioms of a linear space.




1. The space of geometric vectors. Consider the set of all
geometric vectors in three-dimensional Euclidean space. Note that
two elements of this set, that is, two vectors, are considered equal
if and only if they are collinear, have equal lengths and are in
the same direction. We thus have in mind free vectors whose point
of application may be chosen arbitrarily.

Admissible replacements of a vector consist in parallel
translations to new points of application. Clearly, the three conditions of
Subsection 3 of Section 1 hold true. Linear operations on geometric
vectors are carried out in the familiar way: addition by means of
the parallelogram rule; multiplication by a real number α
represents an α-fold stretching of the vector. Both operations are
invariant under admissible replacements. Indeed, if a = a', b = b',
then the parallelogram constructed on the vectors a', b' is obtained
by a parallel translation of the parallelogram constructed on a, b.
Thus, the vector a' + b' is obtained by a parallel translation of
the vector a + b, that is, a + b = a' + b'. It is quite apparent that
the equation αa = αa' also holds true.
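
The invariance of both operations under admissible replacements can be seen on a small numerical model. The sketch below is an assumption-laden illustration, not the book's construction: a vector is modelled as a directed segment (start, end) in the plane rather than in three-dimensional space, and the helper names `translate`, `displacement`, `equal`, `add` and `scale` are hypothetical.

```python
from fractions import Fraction as F


def translate(seg, shift):
    """Parallel translation of a directed segment (start, end) by `shift`."""
    (sx, sy), (ex, ey) = seg
    dx, dy = shift
    return ((sx + dx, sy + dy), (ex + dx, ey + dy))


def displacement(seg):
    """Components of the directed segment; unchanged by parallel translation."""
    (sx, sy), (ex, ey) = seg
    return (ex - sx, ey - sy)


def equal(seg1, seg2):
    """Two directed segments are equal as free vectors."""
    return displacement(seg1) == displacement(seg2)


def add(seg1, seg2):
    """Sum drawn from the start of seg1 (triangle form of the parallelogram rule)."""
    (sx, sy), _ = seg1
    (ax, ay), (bx, by) = displacement(seg1), displacement(seg2)
    return ((sx, sy), (sx + ax + bx, sy + ay + by))


def scale(alpha, seg):
    """alpha-fold stretching that keeps the point of application."""
    (sx, sy), _ = seg
    ax, ay = displacement(seg)
    return ((sx, sy), (sx + alpha * ax, sy + alpha * ay))


a = ((0, 0), (2, 1))
b = ((1, 1), (0, 4))
a2 = translate(a, (5, -3))  # admissible replacement of a
b2 = translate(b, (-2, 7))  # admissible replacement of b

print(equal(a, a2), equal(b, b2))
print(equal(add(a, b), add(a2, b2)))                 # a + b = a' + b'
print(equal(scale(F(3, 2), a), scale(F(3, 2), a2)))  # alpha a = alpha a'
```

The sums and the stretched vectors start at different points of application, yet they are equal in the sense of free vectors, which is exactly what invariance requires.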

Geometric vectors with the indicated definition of linear
operations form a real linear space. The zero element here is a vector
of zero length. If x is any vector, then the inverse is y = -x,
which is a vector of the same length but in the opposite direction.
The requirements of the axioms (1)-(8), Subsection 4, Section 1,
hold true. This is evident from simple geometrical reasoning and
is of course in no way accidental. The point is that geometric
vectors served as the original model for the general concept of a
linear space. In other words, the axioms (1)-(8) express certain
properties of linear operations on geometric vectors that are quite
familiar from elementary vector algebra.

One might readily ask why, in axioms (1)-(8), there are not
included certain simple and important properties of geometric
vectors that are constantly utilized in vector computations. For
instance, that the multiplication of a vector by the scalar zero
yields a zero vector or that in the multiplication of any vector x
by the scalar -1 we get the opposite vector -x. It turns out that
this is not necessary since such properties may be proved, that is,
derived from the axioms, which is what will be done in Section 3.
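
To indicate how such a derivation can run, here is one possible route, given only as a sketch; the proofs in Section 3 may be organized differently. The axiom numbers refer to Subsection 4 of Section 1, and θ denotes the zero element.

```latex
\begin{align*}
0x + 0x &= (0 + 0)x = 0x
  && \text{(axiom 7)} \\
0x &= 0x + \theta = 0x + \bigl(0x + (-(0x))\bigr)
  && \text{(axioms 3 and 4)} \\
   &= (0x + 0x) + (-(0x)) = 0x + (-(0x)) = \theta
  && \text{(axiom 2, the first line, axiom 4)} \\[4pt]
x + (-1)x &= 1x + (-1)x = \bigl(1 + (-1)\bigr)x = 0x = \theta
  && \text{(axioms 5 and 7)}
\end{align*}
```

The last line shows that (-1)x is an inverse of x. That it coincides with -x follows once the inverse is known to be unique: if x + y = θ and x + y' = θ, then y = y + θ = y + (x + y') = (y + x) + y' = θ + y' = y', by axioms 1, 2 and 3.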

2. Zero space. Let L be a set consisting of only one element.
What that element is is immaterial. Let us denote it by the letter θ.
We now define linear operations in L, assuming that θ added to
itself yields θ and that when it is multiplied by a real number it
also yields θ. It is easy to see that the axioms (1)-(8) hold true.
Hence, the given set L is a real linear space consisting obviously
of the sole zero element. It is just as easy to define the set L as a
complex linear space.



Remark. All other (real or complex) linear spaces of necessity
have an infinitude of elements, for in Subsection 2 of Section 3 it
is shown that if a linear space contains at least one element a
different from the zero element, then for distinct scalars α and β
the elements αa and βa are also distinct.

3. Coordinate space. Now let L denote a set whose elements
consist of all possible ordered n-tuples of real numbers (n a fixed
natural number). An ordered n-tuple is one in which the
constituents have been numbered. (They need not necessarily be
distinct.) When we say that an element x in L is an n-tuple of
numbers $x_1, x_2, \ldots, x_n$, we will write $x = \{x_1, x_2, \ldots, x_n\}$.
Assuming x to be arbitrary, let us consider another (also arbitrary)
element $y = \{y_1, y_2, \ldots, y_n\}$. We will assume that the elements x
and y are equal if and only if $x_1 = y_1$, $x_2 = y_2$, ..., $x_n = y_n$. We
define linear operations in L by the relations

$$x + y = \{x_1 + y_1,\ x_2 + y_2,\ \ldots,\ x_n + y_n\}, \tag{1}$$

$$\alpha x = \{\alpha x_1,\ \alpha x_2,\ \ldots,\ \alpha x_n\}. \tag{2}$$

Then the requirements of the first two axioms of a linear space
hold true since the addition of real numbers is commutative and
associative. To verify axioms three and four, it suffices to indicate
a zero element in L, namely,

$$\theta = \{0,\ 0,\ \ldots,\ 0\}. \tag{3}$$

It is also clear that for any x in L there is an inverse element -x,
namely,

$$-x = \{-x_1,\ -x_2,\ \ldots,\ -x_n\}. \tag{4}$$

Axiom (5) is immediately apparent from relation (2). Finally,
axioms (6), (7), and (8) hold true because of (1), (2) and also
due to the fact that the multiplication of real numbers is
associative and distributive.

To summarize, then, the set L with specified linear operations is
a real linear space. We will call it a real coordinate space $K_n$.
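
Relations (1)-(4) translate directly into code. The following is a minimal sketch and not part of the original text; the function names are assumptions, and exact rational numbers (`Fraction`) stand in for real numbers so that the comparisons are free of rounding error.

```python
from fractions import Fraction as F


def add(x, y):
    """Relation (1): componentwise sum of two n-tuples."""
    if len(x) != len(y):
        raise ValueError("both n-tuples must have the same length n")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(alpha, x):
    """Relation (2): multiply every component by the scalar alpha."""
    return tuple(alpha * xi for xi in x)


def zero(n):
    """Relation (3): the zero element, an n-tuple of zeros."""
    return tuple(F(0) for _ in range(n))


def neg(x):
    """Relation (4): the inverse element -x."""
    return tuple(-xi for xi in x)


x = (F(1), F(-2), F(1, 2))
y = (F(3), F(0), F(5, 2))
alpha, beta = F(2, 3), F(-4)

print(add(x, y))
print(add(x, neg(x)) == zero(3))                                       # axiom 4
print(scale(alpha + beta, x) == add(scale(alpha, x), scale(beta, x)))  # axiom 7
```

Each axiom reduces to the corresponding property of numbers applied component by component, which is why the verification in the text is so short.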

Remark. The set L now under consideration does not permit
us to regard the factor α in (2) as a complex number, because a
complex α in the right member of (2) would yield a set of complex
numbers that is not an element of L.

4.  This  time  let  us  denote  by  L  the  set  of  all  ordered  n-tuples  of 
complex  numbers. 

We define the linear operations by (1) and (2), now assuming
that α, $x_j$, $y_j$ (j = 1, ..., n) are complex numbers. As in the
preceding subsection, all eight axioms (1)-(8) hold true, and the
zero and inverse elements are expressed by the formulas (3) and
(4). Thus, L is a linear space; it is a complex linear space since
the scalars α are complex numbers. We will call it a complex
coordinate space $K_n$.

Remark. However, there is nothing to stop us, in (1) and (2),
from using only real numbers for α while $x_j$, $y_j$ remain complex.
Then the set L will be a real linear space. It is clear from this that
the same objects (for instance, ordered sets of complex numbers)
can serve as vectors of distinct linear spaces. For this reason, in
the general definition of Section 1, a linear space is not merely a
set L but a set together with linear operations specified in it; and
it is also necessary to indicate the field from which the factors α
are taken.
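
The role played by the choice of scalars can be illustrated by a membership test. This is a sketch under modelling assumptions that are not in the text: Python's `int` and `float` values stand for real numbers, `complex` values for complex numbers, and the names `is_real_tuple` and `is_complex_tuple` are hypothetical.

```python
def is_real_tuple(x):
    """Membership test for the set of n-tuples of real numbers."""
    return all(isinstance(xi, (int, float)) for xi in x)


def is_complex_tuple(x):
    """Membership test for the set of n-tuples of complex numbers."""
    return all(isinstance(xi, (int, float, complex)) for xi in x)


def scale(alpha, x):
    """Relation (2): multiply every component by alpha."""
    return tuple(alpha * xi for xi in x)


real_x = (1.0, -2.0, 0.5)
complex_x = (1 + 2j, -1j, 3 + 0j)

print(is_real_tuple(scale(2.5, real_x)))        # real scalar: result stays in the set
print(is_real_tuple(scale(1j, real_x)))         # complex scalar: result leaves the set
print(is_complex_tuple(scale(1j, complex_x)))   # complex tuples with complex scalars
print(is_complex_tuple(scale(2.5, complex_x)))  # complex tuples with real scalars only
```

The last two lines use the same set of n-tuples with two different fields of scalars, so the same objects serve as vectors of a complex space in one case and of a real space in the other.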

5. The space of matrices. According to accepted practice, we
will say that a rectangular matrix, more precisely, an m × n
matrix, is an array of numbers arranged in m rows with n
numbers in each row. If the numbers making up the matrix are
denoted by $a_{ik}$ (i = 1, 2, ..., m; k = 1, 2, ..., n) and the matrix
itself by a, then we write

$$a = \begin{Vmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{Vmatrix}$$

In this notation, the given numbers are also arranged in columns
(the number $a_{ik}$ lies in the ith row and the kth column). Besides
this expanded notation we will also make use of an abbreviated
notation:

$$a = \|a_{ik}\|$$

Let  us  agree  to  call  a  matrix  a  real  matrix  when  it  is  composed  of 
real  numbers  and  a  complex  matrix  when  the  elements  (entries) 
are complex numbers.

Let L be the set of all m × n matrices, for example, real
matrices. Two matrices will be considered equal elements of the
set L if and only if corresponding positions in the matrices are
occupied by the same numbers (that is to say, one and the same
number lies in both matrices at the intersection of the ith row
and kth column). Let us equip the set L with linear operations,
namely, if $a = \|a_{ik}\|$, $b = \|b_{ik}\|$ are arbitrary matrices in L and $\alpha$
is an arbitrary real number, then we set

$$a + b = \|a_{ik} + b_{ik}\|, \qquad \alpha a = \|\alpha a_{ik}\| \tag{5}$$



In other words, when adding the m × n matrices a and b, we add
pairwise the identically located numbers $a_{ik}$ and $b_{ik}$. To multiply
matrix a by a scalar $\alpha$, we multiply by $\alpha$ all the numbers that
make up matrix a. Just as in Subsection 3, we can establish that
the linear operations (5) satisfy the axioms (1)-(8). Here the role
of the zero element in L is played by the matrix $\theta$, which consists
entirely of zeros (the zero matrix). For the inverse element we
have the matrix $\|-a_{ik}\|$, which is the inverse of $a = \|a_{ik}\|$. Thus,
L together with the linear operations (5) is a real linear space.
Similarly, the set of all complex m × n matrices with linear
operations (5), where $\alpha$ is a complex number, constitutes a complex
linear space. Naturally, when we consider complex m × n
matrices, we can regard $\alpha$ to be real. Then we get a real linear space
of the same complex matrices.

Remark.  In  the  particular  case  of  m  =  1  (for  a  given  n),  we 
obtain  matrices  each  of  which  has  only  one  row  (consisting  of  n 
numbers).  The  linear  space  of  such  matrices  is  nothing  other  than 
the coordinate space $K_n$ (see Subsection 3). For n = 1 and given
m we get matrices with only one column. Clearly, they too represent
a coordinate space, namely $K_m$. What is more, the space of
arbitrary m × n matrices may be regarded as a coordinate space
$K_{mn}$, since we can readily establish for all elements of the matrices
a general numbering according to some standard system and then
write them out in one row or one column.
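The identification of the space of m × n matrices with a coordinate space can be sketched as follows. This is an illustration only: the row-by-row numbering used in flatten is one possible "standard system", and the names mat_add, mat_scale and flatten are hypothetical helpers.

```python
# Matrices are lists of rows; the operations follow formula (5).
def mat_add(a, b):
    return [[aik + bik for aik, bik in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(alpha, a):
    return [[alpha * aik for aik in row] for row in a]

def flatten(a):
    # Number the m*n entries row by row and write them out in one row.
    return [aik for row in a for aik in row]

a = [[1, 2, 3], [4, 5, 6]]
b = [[0, -1, 2], [7, 0, 1]]

# Operating on matrices and then flattening ...
lhs = flatten(mat_add(a, mat_scale(3, b)))
# ... gives the same result as flattening first and operating on 6-tuples.
rhs = [x + 3 * y for x, y in zip(flatten(a), flatten(b))]
```

Since the numbering is a one-to-one correspondence that preserves both linear operations, the two spaces differ only in how their elements are written out.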

6. The space of continuous functions. Let us take, on the real
line, an arbitrary interval $\tau_1 \le \tau \le \tau_2$ and denote by L the set of
all functions that are continuous on this interval and that assume
real values. Bearing in mind that an element x of L is a certain
continuous function $x(\tau)$, $\tau_1 \le \tau \le \tau_2$, we will write $x = \{x(\tau)\}$.
Regarding x as arbitrary, let us consider another, also arbitrary,
element $y = \{y(\tau)\}$. The elements x and y will be regarded as equal
if and only if $x(\tau) \equiv y(\tau)$, that is, when $x(\tau)$ and $y(\tau)$ coincide at
every point $\tau$ of the interval $\tau_1 \le \tau \le \tau_2$. We define linear
operations in L by setting

$$x + y = \{x(\tau) + y(\tau)\}, \qquad \alpha x = \{\alpha x(\tau)\} \tag{6}$$

where $\alpha$ is a real number. In other words, we add functions and
multiply them by scalars in the usual way as accepted in analysis.
It is essential to point out that adding continuous functions and
multiplying a continuous function by a constant yield continuous
functions. It is easy to see that the linear operations (6) satisfy
the axioms (1)-(8). Here, the zero element $\theta$ is the function equal to
zero at all points $\tau$ of the interval $[\tau_1, \tau_2]$. The inverse of an
element $x = \{x(\tau)\}$ is $\{-x(\tau)\}$. Thus, the set of all real functions
continuous on $\tau_1 \le \tau \le \tau_2$ together with the linear operations (6)
is a real linear space.

If for L we take the set of all functions continuous on
$\tau_1 \le \tau \le \tau_2$ and having complex values, that is, functions of the form
$x(\tau) = u(\tau) + iv(\tau)$, then in this set we can specify the linear
operations (6) for a complex $\alpha$. All axioms (1)-(8) are again satisfied
and we obtain a complex linear space of continuous functions with
complex values. Here too, as in the examples examined in
Subsections 4 and 5, we can make the set of continuous functions

$$x(\tau) = u(\tau) + iv(\tau)$$

a real linear space if in (6) we admit only real numbers for $\alpha$.
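The operations (6) can be modelled directly with functions as values. The sketch below is illustrative only: it represents an element of L by a Python callable, checks a few sample points rather than the whole interval, and the names add, scale, neg and zero are assumptions of the example.

```python
import math

def add(x, y):
    # (x + y)(tau) = x(tau) + y(tau)
    return lambda tau: x(tau) + y(tau)

def scale(alpha, x):
    # (alpha x)(tau) = alpha * x(tau)
    return lambda tau: alpha * x(tau)

def neg(x):
    # the inverse element {-x(tau)}
    return scale(-1.0, x)

zero = lambda tau: 0.0          # the zero element: identically zero on the interval

x = math.sin
y = lambda tau: tau ** 2
z = add(x, scale(2.0, y))       # the element x + 2y

sample = [0.0, 0.5, 1.0]        # a few points of the interval [0, 1]
values = [z(t) for t in sample]
cancel = [add(x, neg(x))(t) for t in sample]   # x + (-x) vanishes pointwise
```

Equality of elements means coincidence at every point of the interval, so a finite sample can refute an equality but cannot prove one.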

7. The space of integrable functions.* We consider all real-valued
functions integrable on the interval $\tau_1 \le \tau \le \tau_2$ and
denote the set of these functions by L.

It  will  be  recalled  that  if  we  change  an  integrable  function  at 
one  point  in  any  way  whatsoever  (retaining  the  remaining  values) 
then  the  function  will  remain  integrable,  and  the  integral  of  the 
function  will  be  equal  to  the  same  number  as  prior  to  the  change. 
The  same  goes  if  the  function  is  changed  at  several  points,  even 
at  an  infinitude  of  points,  provided  that  this  set  is  of  measure  zero. 
From  the  viewpoint  of  integration  theory,  such  changes  in  the 
function  are  not  essential.  For  this  reason,  in  questions  of  integra¬ 
tion  theory  it  is  not  desirable  to  distinguish  between  two  functions 
if they coincide on the interval $\tau_1 \le \tau \le \tau_2$ almost everywhere,
that is, at all points of $\tau_1 \le \tau \le \tau_2$ except possibly for a set of
measure zero.

In this connection, we agree to consider as equal two elements
$x = \{x(\tau)\}$, $y = \{y(\tau)\}$ of the set L if $x(\tau) = y(\tau)$ almost
everywhere on the interval $\tau_1 \le \tau \le \tau_2$. Accordingly, an admissible
replacement of an arbitrary element $x = \{x(\tau)\} \in L$ consists in any
change in the values of the function $x(\tau)$ on any set of measure
zero.

It will readily be seen that this definition of equality of elements
of L satisfies the three requirements of Subsection 3, Section 1.
For the first two it is obvious. Now let y = x, that is, $y(\tau) = x(\tau)$
everywhere except for a certain set $M_1$ of measure zero. Let z = x,
that is, $z(\tau) = x(\tau)$ everywhere except for a certain set $M_2$ of
measure zero. Then $y(\tau) = z(\tau)$ everywhere except possibly for the
union of the sets $M_1$ and $M_2$. But the union of two sets of measure
zero is a set of measure zero. Consequently, $y(\tau) = z(\tau)$ almost
everywhere and, hence, y = z. Thus, the third requirement is
satisfied: if y = x, z = x, then y = z.

* This subsection may be skipped if the reader is not familiar with the
theory of integration.

If in the set L we define the linear operations according to the
formulas (6) of Subsection 6, then invariance of the linear
operations relative to admissible replacements will be ensured and all
axioms (1)-(8) will be seen to hold true. We will not dwell on the
proof of these circumstances and will merely note that in the given
case the zero element is $\theta = \{\theta(\tau)\}$ where $\theta(\tau)$ is any function
equal to zero almost everywhere on the interval $[\tau_1, \tau_2]$.

The set L together with the specified linear operations is termed
the space of functions integrable on the interval $[\tau_1, \tau_2]$.

8.  Counterexample.  Denote  by  L  the  set  of  all  ordered  n-tuples 
of  real  numbers  (n  >  1),  that  is,  a  set  of  the  same  kind  as  in 
Subsection  3.  Let  us  define  the  sum  of  two  elements  of  L  in  the 
same  way  as  was  done  in  Subsection  3: 

$$x + y = \{x_1 + y_1,\ x_2 + y_2,\ \dots,\ x_n + y_n\} \tag{7}$$

Let multiplication of x by $\alpha$ be given by the rule

$$\alpha x = \{\alpha x_1,\ x_2,\ \dots,\ x_n\} \tag{8}$$

(on the right side, only $x_1$ is multiplied by $\alpha$). The axioms (1)-(4)
are satisfied by (7), and

$$\theta = \{0, 0, \dots, 0\}, \qquad -x = \{-x_1, -x_2, \dots, -x_n\}$$

It is also easy to verify that the requirements of axioms (5), (6),
(8) are satisfied, yet axiom (7) does not hold:

$$(\alpha + \beta)x = \{(\alpha + \beta)x_1,\ x_2,\ \dots,\ x_n\},$$

$$\alpha x + \beta x = \{(\alpha + \beta)x_1,\ 2x_2,\ \dots,\ 2x_n\}$$

Thus,  the  set  L  with  operations  (7),  (8)  is  not  a  linear  space. 
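The counterexample is easy to verify numerically. In the sketch below the scalar axioms are read as they are used in Section 3: (5) is 1·x = x, (6) is α(βx) = (αβ)x, (7) is (α + β)x = αx + βx, and (8) is taken to be α(x + y) = αx + αy; the last reading is an assumption, since the list of axioms is not reproduced here.

```python
def add(x, y):
    # rule (7): ordinary componentwise addition
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(alpha, x):
    # rule (8): only the first coordinate is multiplied by alpha
    return (alpha * x[0],) + tuple(x[1:])

x = (1, 2, 3)
y = (4, 5, 6)
alpha, beta = 2, 5

axiom5 = scale(1, x) == x
axiom6 = scale(alpha, scale(beta, x)) == scale(alpha * beta, x)
axiom8 = scale(alpha, add(x, y)) == add(scale(alpha, x), scale(alpha, y))

lhs7 = scale(alpha + beta, x)                    # (alpha + beta) x
rhs7 = add(scale(alpha, x), scale(beta, x))      # alpha x + beta x
```

Here lhs7 keeps the coordinates x_2, x_3 unchanged while rhs7 doubles them, which is exactly the discrepancy displayed above.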

§  3.  Elementary  corollaries  to  the  axioms  of  a  linear  space 

1.  Let  us  now  examine  the  general  theory,  that  is,  the  conclu¬ 
sions  that  follow  from  axioms  (l)-(8)  irrespective  of  the  particu¬ 
larities  of  specific  linear  spaces.  The  following  propositions  hold 
true. 

(1)  In  every  linear  space  there  is  only  one  zero  vector. 

Proof. Suppose the elements $\theta_1$ and $\theta_2$ are zero elements. By
axioms (1) and (3) they coincide:

$$\theta_2 = \theta_2 + \theta_1 = \theta_1 + \theta_2 = \theta_1$$

Remark. When we say that there is only one zero vector, we
mean that we do not distinguish between equal vectors. Uniqueness
is to be understood in the same way in the other theorems as
well, for instance, in the following proposition.

(2)  For  any  vector  x  there  is  only  one  opposite  vector. 

Proof. Suppose that $x + y_1 = \theta$ and that $x + y_2 = \theta$. Axioms
(1)-(4) permit writing down the following chain of equations:

$$y_2 = y_2 + \theta = y_2 + (x + y_1) = (y_2 + x) + y_1 = (x + y_2) + y_1 = \theta + y_1 = y_1$$

which means $y_2 = y_1$.

(3) The product of any vector x by the number 0 is equal to the
zero vector $\theta$.

Proof. For a given vector x take the opposite vector y. Using
axioms (2)-(5) and (7), we get

$$0 \cdot x = 0 \cdot x + \theta = 0 \cdot x + (x + y) = (0 + 1)x + y = x + y = \theta$$

(4) The product of any vector x by the number −1 is equal to
the vector opposite to x, that is, $(-1)x = -x$.

Proof. It is required to establish that $x + (-1)x = \theta$. From the
preceding property and from axioms (5) and (7) we have

$$x + (-1)x = (1 - 1)x = 0 \cdot x = \theta$$


(5) The product of the zero vector $\theta$ by any scalar $\alpha$ is equal to
the zero vector.

Proof. Take an arbitrary vector x. Using axiom (6) and
property (3), we find

$$\alpha\theta = \alpha(0 \cdot x) = (\alpha \cdot 0)x = 0 \cdot x = \theta$$


2. Remarks. (1) From property (5) it follows that the product
of a nonzero vector by a nonzero scalar always yields a nonzero
vector. Indeed, if for $\lambda \ne 0$, $a \ne \theta$ it were true that $\lambda a = \theta$, then
because of property (5) and axioms (5) and (6), we would have

$$a = 1 \cdot a = \left(\frac{1}{\lambda} \cdot \lambda\right) a = \frac{1}{\lambda}(\lambda a) = \frac{1}{\lambda}\,\theta = \theta$$

which contradicts the condition $a \ne \theta$.

(2) If $\alpha \ne \beta$ and $a \ne \theta$, then $\alpha a \ne \beta a$. Indeed, if it turned out
that $\alpha a = \beta a$ then it would be true that $\alpha a + (-\beta)a = \theta$, or
$(\alpha - \beta)a = \theta$, which runs counter to the foregoing, since $\alpha - \beta \ne 0$
and $a \ne \theta$.

3. The operation of subtraction is defined in a linear space.
Namely, a vector x is called the difference between a vector b and a
vector a if x + a = b, and we write x = b − a.

We  will  prove  that  for  arbitrary  elements  a  and  b  a  difference 
exists  and  it  is  unique. 



Existence. We prove that the vector $x = b + (-1)a$ is the
difference b − a. By axioms (2), (3), (5), (7) and property (3) we
have

$$x + a = b + (-1)a + a = b + (-1 + 1)a = b + 0 \cdot a = b$$

Uniqueness. We show that if x is the difference b − a, then it
can always be represented in the form $x = b + (-1)a$. Indeed,
from the equation x + a = b we get, with the aid of axioms (2),
(3), (5), (7) and property (3),

$$x = x + \theta = x + (1 - 1)a = x + a + (-1)a = b + (-1)a$$

4.  In  the  sequel,  we  will  make  use  of  the  axioms  of  linear  space 
and  of  the  properties  established  in  this  section  without  detailed 
explanations.  Due  to  the  axioms  and  the  results  obtained  here, 
computations  involving  elements  of  a  linear  space  are  carried  out 
in  a  manner  similar  to  the  manipulations  of  elementary  algebra, 
with  the  sole  difference  that  there  is  no  multiplication  and  division 
of  vectors  and  one  must  distinguish  between  the  number  zero  and 
the  zero  vector. 

In  particular,  we  can  transpose  a  vector  from  one  member  of 
a  vector  equation  to  the  other  by  multiplying  that  vector  by  minus 
one  (or,  what  is  the  same  thing,  by  replacing  it  by  the  opposite 
vector). 


§  4.  Linear  combinations.  Linear  dependence 

1. Given a finite number of elements of a linear space:
a, b, c, ..., q. Also, let $\alpha, \beta, \gamma, \dots, \kappa$ be arbitrary scalars.

Definition 1. Any element x of the space L that can be
represented as

$$x = \alpha a + \beta b + \gamma c + \dots + \kappa q$$

is called a linear combination of the elements a, b, c, ..., q. We
also say that x is expressed linearly in terms of a, b, c, ..., q.

Definition 2. A linear combination is termed trivial if
$\alpha = \beta = \gamma = \dots = \kappa = 0$ and nontrivial if there is at least one nonzero
scalar among the scalars $\alpha, \beta, \dots, \kappa$.

Definition 3. A system (set) of vectors a, b, c, ..., q is said to
be linearly dependent if there is a nontrivial linear combination of
vectors a, b, c, ..., q equal to the zero vector, in other words, if
it is true that

$$\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$$

where there is at least one nonzero scalar among the scalars
$\alpha, \beta, \gamma, \dots, \kappa$.




Definition 4. A system of vectors a, b, c, ..., q is said to be
linearly independent if the equation

$$\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$$

is possible only if

$$\alpha = \beta = \gamma = \dots = \kappa = 0$$

2.  Let  us  consider  the  properties  of  the  foregoing  concepts. 

(1) It follows directly from the definitions that any finite system
of vectors is either linearly dependent or linearly independent. We
will show that a system consisting of one vector is linearly
dependent if and only if this vector is the zero vector.

Indeed, the equation $\alpha\theta = \theta$ for any $\alpha$, in particular for $\alpha \ne 0$,
was established in Subsection 1 of Section 3. Now let $x \ne \theta$ and
$\alpha x = \theta$. Then $\alpha = 0$ in accordance with Subsection 2 of Section 3.

(2)  If  part  of  a  system  is  linearly  dependent,  then  the  whole 
system  is  linearly  dependent. 

Suppose it is known that a part of the system a, b, c, ..., q
(consisting of the vectors c, ..., q) is linearly dependent. This
means that there exist scalars $\gamma, \dots, \kappa$ not all equal to zero and
such that $\gamma c + \dots + \kappa q = \theta$. But then the linear combination
$0 \cdot a + 0 \cdot b + \gamma c + \dots + \kappa q = \theta$ is nontrivial since there are
nonzero scalars among $\gamma, \dots, \kappa$.

(3)  If  an  entire  system  of  vectors  is  linearly  independent,  then 
so  also  is  any  part  of  that  system. 

This  follows  directly  from  the  preceding  property.  In  particular, 
the  zero  vector  cannot  enter  into  a  linearly  independent  system. 

(4)  If  a  system  is  linearly  dependent,  then  there  will  be  at  least 
one  vector  in  it  that  will  be  expressed  linearly  in  terms  of  the  re¬ 
maining  vectors. 

Indeed, if $\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$ and there are nonzero
coefficients among $\alpha, \beta, \dots, \kappa$, then any one of the vectors having
nonzero coefficients may be expressed linearly in terms of the
remaining vectors of the system. For instance, if $\alpha \ne 0$, then

$$a = \left(-\frac{\beta}{\alpha}\right) b + \left(-\frac{\gamma}{\alpha}\right) c + \dots + \left(-\frac{\kappa}{\alpha}\right) q$$

By property (4), the existence of such a vector is necessary for
the linear dependence of the system of vectors. It is also
sufficient; namely, the following assertion holds true.

(5) If some element of a system is a linear combination of the
remaining elements, then the system is linearly dependent.

Indeed, if

$$a = \beta' b + \gamma' c + \dots + \kappa' q$$

then

$$1 \cdot a + (-\beta')b + (-\gamma')c + \dots + (-\kappa')q = \theta$$

and the linear combination in the left member of the last equation
is nontrivial, since the coefficient of a is 1.
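Properties (4) and (5) can be traced on concrete vectors of a coordinate space. The sketch below is an illustration under assumed data: the three vectors and the helper combo are chosen for the example, and exact rational arithmetic is used so that the divisions by α introduce no rounding.

```python
from fractions import Fraction

def combo(coeffs, vectors):
    # The linear combination coeffs[0]*vectors[0] + coeffs[1]*vectors[1] + ...
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n))

a = (1, 2, 3)
b = (0, 1, 1)
c = (2, 5, 7)

# A nontrivial combination equal to the zero vector: 2a + b - c = 0.
alpha, beta, gamma = Fraction(2), Fraction(1), Fraction(-1)
dependence = combo([alpha, beta, gamma], [a, b, c])

# Property (4): since alpha != 0, a = (-beta/alpha) b + (-gamma/alpha) c.
a_from_rest = combo([-beta / alpha, -gamma / alpha], [b, c])

# Property (5): from a = beta' b + gamma' c we get 1*a + (-beta') b + (-gamma') c = 0.
beta_p, gamma_p = -beta / alpha, -gamma / alpha
back_to_zero = combo([1, -beta_p, -gamma_p], [a, b, c])
```

The coefficient 1 of a in the last combination is what guarantees that it is nontrivial.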

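(6) Let $a_1, \dots, a_k$ be certain vectors. Suppose each one of the
vectors $c_1, c_2, \dots, c_n$ is expressed linearly in terms of $a_1, \dots, a_k$:

$$c_1 = \alpha_{11} a_1 + \dots + \alpha_{1k} a_k,$$
$$c_2 = \alpha_{21} a_1 + \dots + \alpha_{2k} a_k,$$
$$\dots$$
$$c_n = \alpha_{n1} a_1 + \dots + \alpha_{nk} a_k$$

Furthermore, let vector b be linearly expressed in terms of
$a_1, \dots, a_k, c_1, \dots, c_n$:

$$b = \lambda_1 a_1 + \dots + \lambda_k a_k + \mu_1 c_1 + \dots + \mu_n c_n$$

Then vector b can be linearly expressed in terms of the vectors
$a_1, \dots, a_k$.

Proof.

$$b = (\lambda_1 + \mu_1 \alpha_{11} + \dots + \mu_n \alpha_{n1})\, a_1 + \dots + (\lambda_k + \mu_1 \alpha_{1k} + \dots + \mu_n \alpha_{nk})\, a_k$$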
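The substitution carried out in the proof of property (6) can be checked on numbers. The following sketch uses assumed sample data (two vectors a_1, a_2, three vectors c_i and arbitrary coefficients); the names combo, alpha, lam and mu are illustrative.

```python
def combo(coeffs, vectors):
    # The linear combination of the given vectors with the given coefficients.
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n))

a_vecs = [(1, 0, 2), (0, 1, 1)]             # a_1, a_2  (k = 2)
alpha = [[1, 2], [3, -1], [0, 4]]           # c_i = alpha[i][0]*a_1 + alpha[i][1]*a_2  (n = 3)
c_vecs = [combo(row, a_vecs) for row in alpha]

lam = [2, -1]                               # coefficients of a_1, a_2 in b
mu = [1, 1, 2]                              # coefficients of c_1, c_2, c_3 in b
b = combo(lam + mu, a_vecs + c_vecs)

# Coefficient of a_j after substitution: lam_j + mu_1*alpha_1j + ... + mu_n*alpha_nj
direct = [lam[j] + sum(mu[i] * alpha[i][j] for i in range(len(mu)))
          for j in range(len(lam))]
b_direct = combo(direct, a_vecs)
```

Both routes give the same vector, so b is indeed a linear combination of a_1 and a_2 alone.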


§  5.  Lemma  on  the  basis  minor 

1. Suppose we have a rectangular matrix $A = \|a_{ij}\|$. We will
regard the rows of the matrix as vectors of the coordinate space
$K_n$, and the columns as vectors of the coordinate space $K_m$ (see
Section 2, Subsections 3 to 5). Then we can speak about the linear
dependence or independence of the rows of the matrix or about the
linear dependence or independence of the columns.

2. Let us mark k distinct rows and k distinct columns of the
matrix A ($k \le n$, $k \le m$). The elements * of matrix A lying at
the intersections of the marked rows and columns form a certain
(clearly, square) matrix B. The determinant of matrix B is called
a minor of order k of the given matrix A.

Now  mark,  if  possible,  one  more  row  and  one  more  column  of  A 
without  repeating  the  ones  already  marked.  Now,  all  marked  rows 
and  columns  intersect  to  form  a  certain  square  matrix  C. 

The  determinant  of  the  matrix  C  is  a  minor  of  order  k  +  1  of  the 
matrix  A.  With  respect  to  the  original  minor  (that  is,  the  deter¬ 
minant  of  matrix  B),  it  is  a  bordering  minor. 


* The elements of a matrix are the numbers that compose it: $a_{11}, a_{12}, \dots$
However, it would be more exact to say that the elements of a matrix are the
symbols $a_{11}, a_{12}, \dots$ Here, two elements $a_{ik}$ and $a_{jl}$ are taken to be distinct
if $i \ne j$ or $k \ne l$ (without precluding the possibility that $a_{ik}$ and $a_{jl}$ denote
the same number). Also note that in a number of cases, matrices are considered
in which $a_{ik}$ are not numbers but other entities, for example, functions.




Remarks.  (1)  If  k  =  n  or  k  =  m,  then  there  are  no  bordering 
minors  for  minors  of  order  k. 

(2) If k = 1, the matrix B consists of a single element of
matrix A. The minors of the first order are the numerical values
of the elements of the matrix.

3.  Definition  1.  A  minor  of  a  matrix  is  called  a  basis  minor  if 
it  is  not  equal  to  zero  and  the  bordering  minors  are  either  equal 
to  zero  or  are  absent  altogether. 

Definition  2.  The  columns  of  the  matrix  intersecting  the  basis 
minor  are  called  basis  columns.  The  terminology  is  similar  for 
rows,  in  which  case  we  have  basis  rows. 

Remark.  A  matrix  can  have  several  basis  minors  and,  accor¬ 
dingly,  several  systems  of  basis  columns.  Every  matrix,  except 
the  zero  matrix,  has  at  least  one  basis  minor  and  thus  at  least  one 
system  of  basis  columns. 
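The definition of a basis minor suggests a simple search: start from a nonzero element (a nonzero minor of order 1) and keep passing to a nonzero bordering minor while one exists. The sketch below follows that idea for small matrices. It is an illustration, not a procedure given in the text; the names det, minor and find_basis_minor are assumptions, rows and columns are numbered from 0, and the determinant is computed by a naive Laplace expansion.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor(a, rows, cols):
    # Determinant of the square matrix at the intersections of the marked rows and columns.
    return det([[a[i][k] for k in cols] for i in rows])

def find_basis_minor(a):
    """Return (rows, cols) of some basis minor, or None for the zero matrix."""
    m, n = len(a), len(a[0])
    start = next(((i, k) for i in range(m) for k in range(n) if a[i][k] != 0), None)
    if start is None:
        return None
    rows, cols = [start[0]], [start[1]]
    grown = True
    while grown:
        grown = False
        for i in (i for i in range(m) if i not in rows):
            for k in (k for k in range(n) if k not in cols):
                r2, c2 = sorted(rows + [i]), sorted(cols + [k])
                if minor(a, r2, c2) != 0:      # a nonzero bordering minor
                    rows, cols, grown = r2, c2, True
                    break
            if grown:
                break
    return rows, cols

a = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
basis = find_basis_minor(a)
```

For this matrix the search stops at the minor formed by rows 0, 2 and columns 0, 1; its only bordering minor is the determinant of the whole matrix, which is zero because the second row is twice the first.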

4.  Lemma  on  the  basis  minor.  The  columns  of  a  matrix  that  in¬ 
tersect  the  basis  minor  are  linearly  independent.  Every  column  can 
be  linearly  expressed  in  terms  of  them. 

By  definition  2,  this  lemma  can  be  expressed  thus: 

The  basis  columns  are  linearly  independent.  Any  column  of  a 
matrix  can  be  expressed  linearly  in  terms  of  the  basis  columns. 

Proof.  The  proof  of  the  first  assertion  is  by  reductio  ad  absur- 
dum.  Suppose  that  the  basis  columns  are  linearly  dependent.  Then 
the  columns  of  the  basis  minor  are  also  linearly  dependent,  but 
then  the  basis  minor  is  equal  to  zero,  which  runs  counter  to  the 
definition. 

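Proof of the second assertion. For the sake of definiteness, we
assume that the basis minor is of order r and occupies the upper
left-hand corner of the matrix:

$$A = \begin{Vmatrix} a_{11} & \dots & a_{1r} & \dots & a_{1k} & \dots & a_{1n} \\ \vdots & & \vdots & & \vdots & & \vdots \\ a_{r1} & \dots & a_{rr} & \dots & a_{rk} & \dots & a_{rn} \\ \vdots & & \vdots & & \vdots & & \vdots \\ a_{m1} & \dots & a_{mr} & \dots & a_{mk} & \dots & a_{mn} \end{Vmatrix}$$

Denote this basis minor by D.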

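Take arbitrary indices i, k ($1 \le i \le m$, $1 \le k \le n$) and form
a determinant of order r + 1:

$$\Delta_{ik} = \begin{vmatrix} a_{11} & \dots & a_{1r} & a_{1k} \\ \vdots & & \vdots & \vdots \\ a_{r1} & \dots & a_{rr} & a_{rk} \\ a_{i1} & \dots & a_{ir} & a_{ik} \end{vmatrix}$$

We will prove that $\Delta_{ik} = 0$. Let us consider three possible cases: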

(1) $i \le r$. In this case, $\Delta_{ik} = 0$ since here the last row
coincides with one of the preceding rows.

(2) $k \le r$. In this case, $\Delta_{ik} = 0$ since here the last column
coincides with one of the preceding ones.

(3) $i > r$, $k > r$. In this case, the determinant $\Delta_{ik}$ is a
bordering determinant with respect to the minor D and is equal to zero
because D is a basis minor.

Fix k and assume that i runs over all possible values from
1 to m.

Let us expand $\Delta_{ik}$ by the elements of the last row. Denote the
cofactors of the elements of the last row by $A_1, A_2, \dots, A_{r+1}$. As i
varies, these quantities remain unchanged since the cofactor of an
element depends only on the position it occupies in the
determinant but does not depend on the numerical value of the element
itself (only the last row of $\Delta_{ik}$ changes with i, and the cofactors
of its elements do not involve that row). The expansion yields

$$\Delta_{ik} = A_1 a_{i1} + \dots + A_r a_{ir} + A_{r+1} a_{ik} = 0 \tag{1}$$

here

$$A_{r+1} = D \ne 0 \tag{2}$$

The relations (1) and (2) yield

$$a_{ik} = \left(-\frac{A_1}{D}\right) a_{i1} + \dots + \left(-\frac{A_r}{D}\right) a_{ir}$$

Recall that k is fixed and i ranges over all values from 1 to m,
therefore

$$\begin{Vmatrix} a_{1k} \\ \vdots \\ a_{mk} \end{Vmatrix} = \left(-\frac{A_1}{D}\right) \begin{Vmatrix} a_{11} \\ \vdots \\ a_{m1} \end{Vmatrix} + \dots + \left(-\frac{A_r}{D}\right) \begin{Vmatrix} a_{1r} \\ \vdots \\ a_{mr} \end{Vmatrix} \tag{3}$$

Formula (3) represents the kth column (which may be taken
arbitrarily) of the matrix in the form of a linear combination of
the basis columns. This completes the proof of the lemma.
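The coefficients in formula (3) can be computed exactly as in the proof. The sketch below assumes, as the proof does, that the basis minor D of order r occupies the upper left-hand corner; it also assumes integer entries, so that exact fractions can be formed. The names det and column_coefficients and the sample matrix are illustrative, and indices are counted from 0.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def column_coefficients(a, r, k):
    """Coefficients -A_j/D that express column k through the first r columns."""
    # First r rows of the bordered determinant: columns 0..r-1 followed by column k.
    top = [[a[s][t] for t in range(r)] + [a[s][k]] for s in range(r)]
    # Cofactors of the last-row elements; they do not involve the last row itself.
    cof = [(-1) ** (r + j) * det([row[:j] + row[j + 1:] for row in top])
           for j in range(r + 1)]
    d = cof[r]                      # A_{r+1} = D, the basis minor
    return [Fraction(-cof[j], d) for j in range(r)]

a = [[1, 2, 3, 1],
     [0, 1, 1, 2],
     [1, 3, 4, 3]]                  # third row = first row + second row

coeffs = column_coefficients(a, 2, 3)
rebuilt = [sum(c * a[i][j] for j, c in enumerate(coeffs)) for i in range(len(a))]
```

In this example the last column equals −3 times the first column plus 2 times the second, and rebuilt reproduces it entry by entry.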

Remark.  A  similar  lemma  naturally  holds  true  for  basis  rows  as 
well. 

5.  As  a  corollary  to  the  lemma  on  the  basis  minor,  we  have  the 
following  theorem. 

Theorem.  The  determinant  of  a  square  matrix  is  equal  to  zero 
if  and  only  if  there  is  a  linear  dependence  between  the  columns  of 
the  matrix.  A  similar  assertion  holds  true  for  rows. 

Proof. If the columns of an n × n matrix are dependent, then its
determinant is equal to zero. This is one of the basic properties of
determinants. We will show that if the columns are independent,
then the determinant is not equal to zero. Indeed, if the columns
are independent and the determinant is equal to 0, then there must
be a basis minor M of order less than n (the matrix is nonzero, since
its columns are independent). But then there is a column that does
not enter into the system of basis columns (the system
corresponding to the minor M) and that can be linearly
expressed in terms of the system, that is, there is a dependence
between the columns. But this contradicts the assumption.

§  6.  Basic  lemma  on  two  systems  of  vectors 

1. Let there be given two systems of vectors $a_1, a_2, \dots, a_k$ and
$b_1, b_2, \dots, b_m$ in one and the same linear space.

Lemma 1 (basic). If the system $b_1, b_2, \dots, b_m$ is linearly
independent, and each of the vectors $b_i$ is linearly expressible in terms
of the system $a_1, a_2, \dots, a_k$, then $m \le k$.

Proof (by contradiction). Suppose that m > k. Write down the
formulas that express the vectors $b_i$ in terms of the vectors $a_j$:

$$\begin{aligned} b_1 &= \alpha_{11} a_1 + \dots + \alpha_{1k} a_k, \\ b_2 &= \alpha_{21} a_1 + \dots + \alpha_{2k} a_k, \\ &\dots \\ b_{m-1} &= \alpha_{m-1,1} a_1 + \dots + \alpha_{m-1,k} a_k, \\ b_m &= \alpha_{m1} a_1 + \dots + \alpha_{mk} a_k \end{aligned} \tag{1}$$

and consider the matrix $\|\alpha_{ij}\|$. If the matrix $\|\alpha_{ij}\|$ is a zero matrix,
then $b_1 = \dots = b_m = \theta$ and the system $b_1, \dots, b_m$ is linearly
dependent, which contradicts the hypothesis. Suppose the matrix
$\|\alpha_{ij}\|$ is nonzero. Then it has a basis minor and the order of the
basis minor does not exceed the number of columns k. The number
of rows in the matrix $\|\alpha_{ij}\|$ is m and is greater than k and,
consequently, is greater than the number of basis rows in the matrix.

Thus,  the  matrix  ||a,j||  has  a  certain  system  of  basis  rows  and 
also  at  least  one  row  that  does  not  enter  into  the  system.  Accord¬ 
ing  to  the  basis-minor  lemma,  the  indicated  row  can  be  linearly 
expressed  in  terms  of  the  basis  rows.  But  then  this  means  that 
there  is  a  linear  dependence  between  the  rows  of  the  matrix  (see 
Section  4,  Subsection  2,  Items  (5)  and  (2)).  We  write  it  in  the 
form 

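$$\lambda_1 \{\alpha_{11}, \dots, \alpha_{1k}\} + \dots + \lambda_m \{\alpha_{m1}, \dots, \alpha_{mk}\} = \{0, \dots, 0\} \tag{2}$$

where there are nonzero scalars among $\lambda_1, \dots, \lambda_m$.

Multiply equations (1) by $\lambda_1, \dots, \lambda_m$ respectively and add them
termwise. Taking into account the linear dependence (2), we find

$$\lambda_1 b_1 + \dots + \lambda_m b_m = a_1 (\lambda_1 \alpha_{11} + \dots + \lambda_m \alpha_{m1}) + \dots + a_k (\lambda_1 \alpha_{1k} + \dots + \lambda_m \alpha_{mk}) = 0 \cdot a_1 + \dots + 0 \cdot a_k = \theta$$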



The system $b_1, \dots, b_m$ proved to be linearly dependent, but this
is impossible by the hypothesis of the lemma. The resulting
contradiction completes the proof of Lemma 1.

2. We say that the vectors $a_{i_1}, \dots, a_{i_r}$ form a linearly
independent subsystem in the system $a_1, \dots, a_k$ ($k \ge r$) if the vectors
$a_{i_1}, \dots, a_{i_r}$ are linearly independent and enter into the system
$a_1, \dots, a_k$.

Clearly, the system $a_1, \dots, a_k$ contains at least one linearly
independent subsystem if and only if there is at least one nonzero
vector among the vectors $a_1, \dots, a_k$.

3. Lemma 2. Suppose the system of vectors $a_1, \dots, a_r, a_{r+1}$ is
linearly dependent and its subsystem $a_1, \dots, a_r$ is linearly
independent. Then the vector $a_{r+1}$ can be expressed linearly in terms of the
vectors $a_1, \dots, a_r$.

Proof. We have the dependence

$$\lambda_1 a_1 + \dots + \lambda_r a_r + \lambda_{r+1} a_{r+1} = \theta \tag{3}$$

where among the scalars $\lambda_1, \dots, \lambda_r, \lambda_{r+1}$ there are some different
from zero. It is clear that $\lambda_{r+1}$ cannot be equal to zero since in that
case the subsystem $a_1, \dots, a_r$ would be a dependent subsystem.
Thus, $\lambda_{r+1} \ne 0$, and from (3) we get

$$a_{r+1} = \left(-\frac{\lambda_1}{\lambda_{r+1}}\right) a_1 + \dots + \left(-\frac{\lambda_r}{\lambda_{r+1}}\right) a_r$$

which is what we set out to prove.

4. Definition. Suppose a system $a_1, \dots, a_k$ contains a linearly
independent subsystem consisting of r vectors. The number r is
termed the rank of the system $a_1, \dots, a_k$ if any subsystem of a
larger number of vectors is linearly dependent, or if there are no
such subsystems (when r = k).

Briefly, the rank of a system is the maximum number of its
linearly independent vectors.

If all vectors of the system $a_1, \dots, a_k$ are zero vectors, we say
the rank of the system is zero.

5. Lemma 3 (generalized basic). Suppose each of the vectors
$b_1, \dots, b_m$ is expressed linearly in terms of the vectors $a_1, \dots, a_k$.
Then the rank of the system $b_1, \dots, b_m$ does not exceed the rank
of the system $a_1, \dots, a_k$.

Proof. Denote by r the rank of the system $a_1, \dots, a_k$. If r = 0,
then the truth of the assertion of Lemma 3 is obvious. If r = k,
the truth of the assertion of Lemma 3 follows from Lemma 1.
Indeed, the rank of the system $b_1, \dots, b_m$, by Lemma 1, does not
exceed k = r.

Let 0 < r < k. Then the system $a_1, \dots, a_k$ will have r linearly
independent vectors. Suppose they are the vectors $a_1, \dots, a_r$. By
adjoining one more vector from the system $a_1, \dots, a_k$, we will each
time obtain linearly dependent systems, namely,

$$a_1, \dots, a_r, a_{r+1};$$
$$a_1, \dots, a_r, a_{r+2};$$
$$\dots$$
$$a_1, \dots, a_r, a_k$$

By Lemma 2, each of the vectors $a_{r+1}, \dots, a_k$ can be linearly
expressed in terms of the vectors $a_1, \dots, a_r$. On the other hand,
the hypothesis of the lemma states that each vector $b_1, \dots, b_m$ is
expressed linearly in terms of all the vectors $a_1, \dots, a_k$. From this
fact and from the preceding conclusion it follows that each vector
$b_1, \dots, b_m$ is linearly expressible in terms of $a_1, \dots, a_r$ (see
Section 4, Subsection 2, Item (6)). But then, by Lemma 1, the number
of vectors in any linearly independent subsystem of the system
$b_1, \dots, b_m$ does not exceed r. The proof of Lemma 3 is complete.
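Lemma 3 can be illustrated numerically. The sketch below computes the rank of a system of coordinate vectors by Gaussian elimination with exact fractions. This technique is not the one used in the text; it is assumed here as a standard tool, justified by the fact that each elimination step replaces the system by one that is linearly expressible through it and vice versa. The function name rank and the sample vectors are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a system of coordinate vectors, by elimination on exact fractions."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def combo(coeffs, vectors):
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n))

a_sys = [(1, 0, 1), (0, 1, 1), (1, 1, 2)]          # a_3 = a_1 + a_2, so the rank is 2
coeff_rows = [[1, 1, 0], [2, -1, 1], [0, 0, 3], [1, 2, 3]]
b_sys = [combo(row, a_sys) for row in coeff_rows]  # four vectors expressed through a_1, a_2, a_3

rank_a = rank(a_sys)
rank_b = rank(b_sys)
```

Although the system b consists of four vectors, its rank cannot exceed the rank of the system a, which here equals 2.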


§  7.  The  rank  of  a  matrix 

1.  Definition.  The  rank  of  a  matrix  is  the  maximum  number  of 
its  linearly  independent  columns. 

In  other  words,  the  rank  of  a  matrix  is  the  rank  of  the  system 
of  its  columns  regarded  as  vectors  of  a  coordinate  space. 

For  the  rank  of  a  matrix  A  we  will  use  the  symbol  “rank  A”. 

If  matrix  A  is  a  zero  matrix,  then  rank  A  =  0  since  a  zero 
matrix  has  no  linearly  independent  columns.  Note  that  the  rank  of 
a  nonzero  matrix  is  always  positive. 

2.  Theorem  on  the  rank  of  a  matrix.  The  rank  of  an  arbitrary 
matrix  is  equal  to  the  maximum  order  of  its  nonzero  minors. 

Proof. If rank A = 0, then A is a zero matrix and it does not
have any nonzero minors. In this case, it is natural to consider
that the maximum order of the nonzero minors is equal to
zero.

Now suppose the matrix A is nonzero. If one of its minors M of order r is not equal to zero, and all higher-order minors are zero or absent, then M is a basis minor. By the basis-minor lemma, the columns of matrix A intersecting minor M are linearly independent. Therefore, rank A ≥ r. By the same lemma, any column of matrix A can be expressed linearly in terms of the basis columns, whence, using Lemma 3 of Section 6, we find rank A ≤ r. And so rank A = r, thus completing the proof.
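The theorem can be tried out on a small matrix. The following is only an illustrative sketch: the function names `det` and `rank_by_minors` and the sample matrix are assumptions made for this example, and the brute-force search over all minors is practical only for small matrices.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)  # determinant of the empty matrix
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def rank_by_minors(a):
    """Maximum order of a nonzero minor of `a`; zero for a zero matrix."""
    rows, cols = len(a), len(a[0])
    for r in range(min(rows, cols), 0, -1):
        for ri in combinations(range(rows), r):
            for ci in combinations(range(cols), r):
                sub = [[Fraction(a[i][j]) for j in ci] for i in ri]
                if det(sub) != 0:
                    return r
    return 0


# Hypothetical sample: the second row is twice the first, so the only
# third-order minor vanishes, while a second-order minor does not.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(rank_by_minors(A))
```

Here the minor formed by rows 1, 3 and columns 1, 2 is nonzero, so it serves as a basis minor and the rank is its order.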

3.  A  number  of  important  corollaries  follow  from  the  reasoning 
carried  out  in  the  preceding  subsection. 

(1)  The  rank  of  a  nonzero  matrix  is  equal  to  the  order  of  any 
one  of  its  basis  minors. 

Indeed, if M is an arbitrary basis minor and r is its order, then, repeating the foregoing arguments, we see that rank A = r.

(2)  All  basis  minors  of  a  nonzero  matrix  have  the  same  order, 
which  is  equal  to  the  rank  of  the  matrix. 

(3)  If  in  matrix  A  the  minor  M  is  a  basis  minor,  then  all  minors 
of  higher  order  (and  not  only  the  minors  bordering  M)  are  equal 
to  zero. 

(4)  The  maximum  number  of  linearly  independent  rows  of an  arbitrary  matrix  A  is  equal  to  the  maximum  number  of its  linearly  independent  columns  (that  is,  it  is  equal  to  the rank  of  A).

Proof. If A is a zero matrix, the number of linearly independent rows and the number of linearly independent columns are both zero. Let A be a nonzero matrix. Take the transpose of A. The rows will then become the columns of the transposed matrix A*, the linearly independent rows become the linearly independent columns of A*, and the maximum order of nonzero minors is preserved because, in a transposition, each of the minors preserves its numerical value. Thus

rank A = rank A*

and is equal to the maximum number of linearly independent rows of matrix A.

(5) If A is an arbitrary m × n matrix, then rank A does not exceed the smaller of the two numbers m and n.

4.  It  is  clear  from  the  foregoing  that  the  rank  of  a  matrix  does 
not  change  in  an  interchange  of  its  columns  or  rows. 

Besides, from the lemmas of Section 6 it follows that the rank of a matrix does not change if to one of the columns we add a linear combination of the other columns.

Similarly, the rank is preserved if to one of the rows we add a linear combination of the other rows.

The properties enumerated in this subsection are ordinarily used for computing the rank of a matrix. Namely, a given matrix is transformed so that the rank remains unchanged but the matrix is changed to one in which the basis minor is immediately apparent.
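The procedure just described can be sketched in code. The sketch below assumes rational entries and uses ordinary Gaussian elimination on rows (interchanging rows and subtracting multiples of one row from another); the function names and the sample matrices are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def rank(a):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in a]
    r = 0  # number of pivot rows found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no nonzero entry left in this column
        m[r], m[pivot] = m[pivot], m[r]  # interchange of rows
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            # subtract a multiple of the pivot row from row i
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def transpose(a):
    return [list(col) for col in zip(*a)]


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [0, 1, 1, 1]]

# To the second row add a linear combination of the other rows.
B = [row[:] for row in A]
B[1] = [b + 5 * r0 - 2 * r2 for b, r0, r2 in zip(A[1], A[0], A[2])]

# Interchange the first and the last columns.
C = [[row[3], row[1], row[2], row[0]] for row in A]

print(rank(A), rank(transpose(A)), rank(B), rank(C))
```

All four numbers coincide, in agreement with Corollary (4) and with the invariance properties of this subsection.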




§ 8. Finite-dimensional and infinite-dimensional spaces. Bases

1. Definition 1. A linear space is said to be n-dimensional if it has a linearly independent system consisting of n vectors, and any system consisting of a larger number of vectors is linearly dependent.

The number n is called the dimension of the linear space. Thus, the dimension of a space is the largest number of its linearly independent vectors.

For example, the space of geometric vectors (see Section 2, Subsection 1) is three-dimensional since it has three independent vectors, and any four vectors are related by a linear dependence. Geometric vectors located in one plane form a two-dimensional space, in which any two noncollinear vectors are linearly independent and any three vectors are linearly dependent. Vectors lying on a single straight line form a one-dimensional space. A linear space containing the zero vector 0 as its sole element is called a zero-dimensional linear space.

2.  All  n-dimensional  spaces  (n  =  0,  1,  2,  3,  . . .)  form  the  class 
of  finite-dimensional  spaces.  But  this  does  not  exhaust  the  set  of 
all  linear  spaces. 

Definition  2.  A  linear  space  is  said  to  be  infinite-dimensional 
if  for  any  integer  N  >  0  there  exists  in  it  a  linearly  independent 
system  consisting  of  N  vectors. 

Example. The linear space of functions continuous on a given interval (see Section 2, Subsection 6) is an infinite-dimensional space. To see that this is so, it suffices to consider the power functions $1, t, t^2, \dots, t^N$.

It is easy to establish their linear independence. Any linear combination of them is a polynomial of degree not higher than N:

$$\alpha_0 + \alpha_1 t + \alpha_2 t^2 + \dots + \alpha_N t^N = p(t)$$

But a polynomial whose coefficients are not all zero has only a finite number of roots; therefore $p(t) \equiv 0$, i.e., $\{p(t)\} = 0$, if and only if

$$\alpha_0 = \alpha_1 = \alpha_2 = \dots = \alpha_N = 0$$

It has thus been demonstrated that the elements under consideration are independent and the space is infinite-dimensional since the number N may be arbitrarily great.
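The independence of the power functions can also be checked numerically for a particular N. The sketch below uses a slightly different argument from the one in the text, and this substitution is an assumption of the example: any linear relation among the functions must also hold among their rows of values at chosen sample points, so if these rows are independent, the functions are independent. The sample points 0, 1, ..., N and the helper `rank` are illustrative choices.

```python
from fractions import Fraction


def rank(a):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in a]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


N = 4
points = [Fraction(k) for k in range(N + 1)]  # N + 1 distinct sample points

# Row k holds the values of the function t^k at the sample points.
value_rows = [[t ** k for t in points] for k in range(N + 1)]

print(rank(value_rows) == N + 1)
```

The rows of values are linearly independent, hence so are the N + 1 functions themselves; the same check may be repeated for any larger N.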

3.  We  now  introduce  a  definition  that  will  be  very  important  for 
what  follows. 

Definition 3. A system of vectors $e_1, \dots, e_n$ in a space L is called a basis if:

(1) the vectors $e_1, \dots, e_n$ are linearly independent;

(2) any vector x in L can be expressed linearly in terms of $e_1, \dots, e_n$, that is,

$$x = x_1 e_1 + \dots + x_n e_n \qquad (1)$$

An equation like (1) is called an expansion of the vector x in terms of the basis $e_1, \dots, e_n$; the numerical coefficients $x_1, \dots, x_n$ are termed the coordinates (or components) of the vector x relative to that basis.

4. Theorem. A linear space is n-dimensional if and only if it has a basis consisting of n vectors.

Proof. (1) Let a space L be n-dimensional. This means that it has a linearly independent system of n vectors $e_1, \dots, e_n$, and that if we add to it an arbitrary vector x of L, we get the linearly dependent system $e_1, \dots, e_n, x$. By Lemma 2, Section 6, the vector x is linearly expressible in terms of the vectors $e_1, \dots, e_n$. Therefore, the system of vectors $e_1, \dots, e_n$ forms a basis in L.

(2) Let L have the basis $e_1, \dots, e_n$. In L we consider an arbitrary linearly independent system of vectors $b_1, \dots, b_m$. By the definition of a basis, each of the vectors $b_j$ can be linearly expressed in terms of $e_1, \dots, e_n$. Therefore, m ≤ n by virtue of Lemma 1, Section 6.

Hence, any system of vectors in L that contains more than n vectors is a linearly dependent system. At the same time, the basis $e_1, \dots, e_n$ forms a linearly independent system containing n vectors. Thus, the dimension of L is equal to n.

Remark.  It  is  apparent  from  the  foregoing  proof  that  in  an 
n-dimensional  space  any  independent  system  of  n  vectors  forms 
a  basis. 

5. As an application of the theorem proved in Subsection 4, let us establish that the coordinate space $K_n$ is n-dimensional.

To do this we consider in $K_n$ the vectors

$$e_1 = \{1, 0, \dots, 0\},$$
$$e_2 = \{0, 1, \dots, 0\},$$
$$\dots$$
$$e_n = \{0, 0, \dots, 1\}$$

According to the definition of linear operations in $K_n$ (see Section 2, Subsection 3), any vector in $K_n$ can be linearly expressed in terms of the vectors $e_1, \dots, e_n$, namely,

$$x = \{x_1, x_2, \dots, x_n\} = x_1 e_1 + x_2 e_2 + \dots + x_n e_n \qquad (3)$$

From this it is clear that a linear combination of the vectors $e_1, \dots, e_n$ is equal to $0 = \{0, 0, \dots, 0\}$ only when all its coefficients are equal to zero. Hence, the vectors $e_1, \dots, e_n$ are independent and form a basis for $K_n$, and so the space $K_n$ is n-dimensional.


§  9.  Linear  operations  in  components 


1. Let a space L be n-dimensional with a basis formed by the vectors $e_1, \dots, e_n$.

Theorem  1.  The  resolution  of  a  vector  in  terms  of  the  given  basis 
is  unique. 

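Proof. Suppose a vector x in L has two resolutions:

$$x = x_1 e_1 + \dots + x_n e_n$$

and

$$x = \tilde{x}_1 e_1 + \dots + \tilde{x}_n e_n$$

Then

$$(x_1 - \tilde{x}_1) e_1 + \dots + (x_n - \tilde{x}_n) e_n = 0$$

and, since the vectors of the basis are linearly independent, it follows that

$$x_1 - \tilde{x}_1 = \dots = x_n - \tilde{x}_n = 0$$

whence

$$x_1 = \tilde{x}_1, \quad \dots, \quad x_n = \tilde{x}_n$$

which completes the proof.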

Corollary. All components of the zero vector 0 are equal to zero for any choice of the basis:

$$0 = 0 \cdot e_1 + 0 \cdot e_2 + \dots + 0 \cdot e_n \qquad (1)$$

Theorem 2. When a vector is multiplied by a scalar, each component of the vector is multiplied by that scalar. In adding two vectors, we add the corresponding components.

Proof. Given the vectors x, y. Expanding them in terms of the basis, we have

$$x = x_1 e_1 + \dots + x_n e_n,$$
$$y = y_1 e_1 + \dots + y_n e_n$$

Let α be an arbitrary scalar. By the axioms of a linear space we have

$$\alpha x = \alpha (x_1 e_1 + \dots + x_n e_n) = (\alpha x_1) e_1 + \dots + (\alpha x_n) e_n$$

Thus, the vector αx has components $\alpha x_1, \dots, \alpha x_n$. Furthermore,

$$x + y = (x_1 e_1 + \dots + x_n e_n) + (y_1 e_1 + \dots + y_n e_n) = (x_1 + y_1) e_1 + \dots + (x_n + y_n) e_n$$

Thus, the vector x + y has the components $x_1 + y_1, \dots, x_n + y_n$.
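Theorems 1 and 2 can be illustrated in the coordinate space $K_3$ with a basis other than the standard one. In the sketch below, the basis, the vectors x and y, and the helper `coordinates` (which finds the components by Gauss-Jordan elimination, a technique not described in the text) are all assumptions made for the example.

```python
from fractions import Fraction


def coordinates(basis, x):
    """Components of x relative to `basis` (n independent vectors of K_n).

    Solves c_1 * basis[0] + ... + c_n * basis[n-1] = x by Gauss-Jordan
    elimination. Raises StopIteration if the given vectors are dependent.
    """
    n = len(basis)
    # Augmented matrix: column j is the basis vector e_j, the last column is x.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(x[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]  # make the pivot equal to 1
        for i in range(n):
            if i != col:
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]


basis = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]  # hypothetical basis e_1, e_2, e_3
x = (2, 3, 1)
y = (1, 0, 1)

cx = coordinates(basis, x)
cy = coordinates(basis, y)
x_plus_y = tuple(p + q for p, q in zip(x, y))
three_x = tuple(3 * p for p in x)

print(cx, cy)
print(coordinates(basis, x_plus_y) == [p + q for p, q in zip(cx, cy)])
print(coordinates(basis, three_x) == [3 * p for p in cx])
```

The components of x + y are the sums of the corresponding components, and the components of 3x are three times those of x, as Theorem 2 asserts.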



2. Let a, b, ..., q be an arbitrary system of vectors in L. Expand each of them in terms of the basis:

$$a = a_1 e_1 + \dots + a_n e_n,$$
$$b = b_1 e_1 + \dots + b_n e_n, \qquad (2)$$
$$\dots$$
$$q = q_1 e_1 + \dots + q_n e_n$$

Along with the vectors (2) let us consider the matrix M formed by their components:

$$M = \begin{Vmatrix} a_1 & \dots & a_n \\ b_1 & \dots & b_n \\ \vdots & & \vdots \\ q_1 & \dots & q_n \end{Vmatrix}$$

The  following  theorem  holds  true. 

Theorem  3.  The  rank  of  the  system  of  vectors  (2)  is  equal  to 
the  rank  of  the  matrix  M. 

Proof. Suppose that the vectors of (2) are linearly related:

$$\alpha a + \beta b + \dots + \varkappa q = 0 \qquad (3)$$

Then from formulas (1), (3), Theorem 1, and Theorem 2 we have

$$\alpha \{a_1, \dots, a_n\} + \beta \{b_1, \dots, b_n\} + \dots + \varkappa \{q_1, \dots, q_n\} = \{0, \dots, 0\} \qquad (4)$$

In other words, the rows of the matrix M are linearly related with the same coefficients $\alpha, \beta, \dots, \varkappa$. Conversely, from (4) follows (3). The reasoning is similar if instead of the entire system (2) we take some subsystem and the corresponding subsystem of rows of M (that is, rows containing the components of the vectors of the chosen subsystem). Therefore, a subsystem of vectors of the system (2) is linearly independent if and only if the corresponding subsystem of rows of the matrix M is linearly independent. This means that the maximum number of linearly independent vectors of the system (2) coincides with the maximum number of linearly independent rows of M. The proof of Theorem 3 is complete.

3.  If  the  number  of  vectors  in  the  system  (2)  is  equal  to  n,  M 
becomes  a  square  matrix.  We  then  obtain  the  following  corollary  to 
the  preceding  theorem. 

A system of n vectors in n-dimensional space is linearly dependent if and only if the determinant of the matrix of the components of the vectors is equal to zero:

$$\Delta = \det M = 0$$

that is, if the rank of the matrix M is less than n.




It will readily be seen that this assertion actually does not differ from the theorem stated in Subsection 5 of Section 5. It is often used as a practical verification of the linear dependence or independence of specific systems of vectors.
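As a sketch of such a practical verification, the determinant criterion may be coded as follows. The function names and the two sample systems are assumptions made for illustration; the determinant is computed by expansion along the first row, which is adequate only for small n.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def linearly_dependent(vectors):
    """For n vectors with n components each: dependent exactly when det M = 0."""
    n = len(vectors)
    if any(len(v) != n for v in vectors):
        raise ValueError("the criterion needs n vectors with n components each")
    M = [[Fraction(c) for c in v] for v in vectors]  # rows are the components
    return det(M) == 0


first = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]   # third row = 2 * second - first
second = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]  # triangular, determinant 1

print(linearly_dependent(first), linearly_dependent(second))
```

The first system is dependent and the second is independent, so by the Remark of Section 8 the second system is a basis of the three-dimensional coordinate space.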

§  10.  Isomorphism  between  linear  spaces 

1. Let there be given two linear spaces L and L' and a one-to-one correspondence established between them, that is,

(1) to each vector a in L there is associated a vector a' in L';

(2) distinct vectors in L have distinct images in L';

(3) the images of elements of L fill L' completely.

Definition. The spaces L and L' are said to be linearly isomorphic if a one-to-one correspondence can be established between them with the following conditions holding true:

$$(a + b)' = a' + b' \qquad (1)$$

$$(\alpha a)' = \alpha a' \qquad (2)$$

The  one-to-one  correspondence  that  satisfies  conditions  (1)  and 
(2)  is  termed  a  linear  isomorphism  between  the  spaces  L  and  L'. 

In other words, in a linear isomorphism the image of a sum is equal to the sum of the images, and the image of a product of a vector by a scalar is equal to the product of its image by that scalar. The algebraic and geometric properties of linearly isomorphic spaces are absolutely identical.

Remark. A linear isomorphism is possible only if the numerical factors in both L and L' are taken from one and the same algebraic field (for example, both spaces L and L' must be real or both complex). For instance, if L is complex and L' is real, the condition (2) cannot be fulfilled because the multiplication by complex factors that is admissible in L is not defined in L'.

Theorem 1. For every n, all n-dimensional real spaces are linearly isomorphic among themselves.

Theorem 1a. For every n, all n-dimensional complex spaces are linearly isomorphic among themselves.

The proofs of Theorem 1 and Theorem 1a coincide formally, the sole difference being that the numerical factors are taken from different fields. Suppose L and L' are both n-dimensional and both real or both complex. We choose an arbitrary basis in each of them: $e_1, \dots, e_n \in L$; $e'_1, \dots, e'_n \in L'$. *

Let x be an element of L. Expand it in terms of the basis:

$$x = x_1 e_1 + \dots + x_n e_n$$


* The symbol $\in$ denotes the membership of a given element in a given set. We write $e_i \in L$ and read: "$e_i$ is an element of L", or "$e_i$ is in L".



Let us now associate with element x an element $x' \in L'$ such that

$$x' = x_1 e'_1 + \dots + x_n e'_n$$

This correspondence is one-to-one due to the theorem on the uniqueness of resolving a vector in terms of the basis. Let us verify the conditions of the isomorphism:

$$(1) \quad (x + y)' = (x_1 + y_1) e'_1 + \dots + (x_n + y_n) e'_n = (x_1 e'_1 + \dots + x_n e'_n) + (y_1 e'_1 + \dots + y_n e'_n) = x' + y';$$

$$(2) \quad (\alpha x)' = (\alpha x_1) e'_1 + \dots + (\alpha x_n) e'_n = \alpha (x_1 e'_1 + \dots + x_n e'_n) = \alpha x'$$

We see that the established correspondence between L and L' satisfies the conditions (1) and (2). This completes the proof of the theorem.
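The construction used in the proof can be traced on a small case. The sketch below takes both L and L' to be realized as the real coordinate plane, each with its own basis; the particular bases `E` and `E_PRIME`, the use of Cramer's rule for the components, and all function names are assumptions made for the illustration.

```python
from fractions import Fraction

E = [(1, 1), (1, -1)]       # hypothetical basis e_1, e_2 chosen in L
E_PRIME = [(2, 0), (1, 1)]  # hypothetical basis e'_1, e'_2 chosen in L'


def coords(basis, x):
    """Components of x relative to a basis of the plane (Cramer's rule)."""
    (a, c), (b, d) = basis  # e_1 = (a, c), e_2 = (b, d)
    delta = Fraction(a * d - b * c)
    return ((x[0] * d - b * x[1]) / delta, (a * x[1] - x[0] * c) / delta)


def combine(basis, cs):
    """The vector cs[0] * basis[0] + cs[1] * basis[1]."""
    return tuple(sum(c * e[i] for c, e in zip(cs, basis)) for i in range(2))


def image(x):
    """x' has the same components relative to E_PRIME as x has relative to E."""
    return combine(E_PRIME, coords(E, x))


def preimage(x_prime):
    return combine(E, coords(E_PRIME, x_prime))


def add(u, v):
    return tuple(p + q for p, q in zip(u, v))


def scale(alpha, v):
    return tuple(alpha * p for p in v)


x, y, alpha = (3, 1), (0, 2), Fraction(7, 2)
print(image(add(x, y)) == add(image(x), image(y)))         # condition (1)
print(image(scale(alpha, x)) == scale(alpha, image(x)))    # condition (2)
print(preimage(image(x)) == x)                             # one-to-one
```

The check is made for particular vectors only; the general statement is what the proof above establishes.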

Remark.  Actually,  what  has  been  proved  is  that  any  two  linear 
spaces  of  the  same  dimension  over  one  and  the  same  algebraic 
field  are  isomorphic. 

2. Because of the theorem just proved, all n-dimensional real linear spaces are isomorphic to the real coordinate space $K_n$; all n-dimensional complex spaces are isomorphic to the complex space $K_n$. Thus, without any loss of generality, in the theory of n-dimensional linear spaces we can confine ourselves to the study of $K_n$ spaces.

3.  Theorem  2.  A  linear  space  isomorphic  to  an  n-dimensional 
space  is  itself  n-dimensional. 

Proof. Given an n-dimensional linear space L. Let L' be a space isomorphic to L. We will first prove that under a linear isomorphism the image of the zero element $0 \in L$ is the zero element of the space L'. For this purpose we take an arbitrary element $a' \in L'$ and its original (preimage) $a \in L$. Since

$$a = a + 0$$

it follows that

$$a' = (a + 0)' \qquad (3)$$

But by the definition of an isomorphism,

$$(a + 0)' = a' + 0' \qquad (4)$$

From (3) and (4) it follows that a' + 0' = a'. Therefore, the image 0' of element 0 is the zero element of the space L'.




It  will  now  be  shown  that  if  we  take  an  independent  system  of 
vectors  in  L,  then  their  images  will  be  independent  vectors  in  L'. 

Let a, b, ..., q be an independent system in L. Consider the relation

$$\alpha a' + \beta b' + \dots + \varkappa q' = 0' \qquad (5)$$

By the definition of an isomorphism, (5) can be rewritten thus:

$$(\alpha a + \beta b + \dots + \varkappa q)' = 0'$$

And since the preimage of the zero element is the zero vector of L, it follows that

$$\alpha a + \beta b + \dots + \varkappa q = 0 \qquad (6)$$

By virtue of the linear independence of the vectors a, b, ..., q in L, it follows from (6) that

$$\alpha = \beta = \dots = \varkappa = 0 \qquad (7)$$

Thus, from (5) follows (7). Hence, the vectors a', b', ..., q' are independent in L'. Since the space L is n-dimensional, it has n linearly independent vectors. Their images in L' are also independent. Hence, the dimensionality of L' is not less than n. In this discussion, we can interchange L and L' to find that the dimensionality of L is not less than that of L'. Therefore, L' has the dimension n and the theorem is proved.

Corollary  1.  Finite-dimensional  spaces  of  unlike  dimensions  are 
not  isomorphic. 

Corollary  2.  An  infinite-dimensional  space  is  not  isomorphic  to 
any  finite-dimensional  space. 

§  11.  Correspondence  between  complex  and  real  spaces 

1.  Finite-dimensional  complex  and  real  linear  spaces  stand  in  a 
relation  to  one  another  that  we  will  now  discuss.  We  begin  with 
an  example. 

Geometric vectors located on a single straight line form a one-dimensional real linear space. This is because an arbitrary nonzero vector multiplied by a real number can be transformed into any other collinear vector.

Geometric vectors located in a plane form a two-dimensional real space. Here, a fixed vector can no longer be transformed into any other vector by multiplication. The supply of real factors is too small compared with the diversity of vectors making up that space, and so two vectors may prove to be linearly independent.

The supply of complex factors is twice as rich. Therefore, multiplication of vectors by complex numbers may be defined so that the collection of geometric vectors in the plane turns into a one-dimensional complex space. This requires the possibility, via multiplication, of transforming any nonzero vector of the given plane into any other vector of the same plane.

This problem can be solved if we define the product of a geometric vector by a complex number in the following manner.

Let a be an arbitrary vector in the plane. We assume that it is laid off from the coordinate origin. Now let $\alpha = \rho(\cos\varphi + i \sin\varphi)$ be a complex factor. Turn vector a around the origin of coordinates through the angle $\varphi$ and then multiply it by the real number $\rho$. Denote the resulting vector by b and set $\alpha a = b$. As before, we add vectors by the parallelogram rule.

With that definition of addition and multiplication, all the axioms of a linear space hold true. To see this, it suffices to note that the complex numbers themselves are depicted by vectors in the plane and that here addition of vectors and multiplication of a vector a by a complex number $\alpha$ are defined in exactly the same way as we ordinarily define addition of complex numbers and multiplication of a complex number a by a complex number $\alpha$. Therefore, in our case the axioms (1)-(8) hold true since they hold for complex numbers. Now any single nonzero vector forms a linearly independent system, and any two vectors are linearly dependent (since multiplication includes a rotation) so that the resulting complex space is one-dimensional.

2. We have seen that a one-dimensional complex space and a two-dimensional real space can be constructed out of the same objects, namely, out of vectors in the plane, with addition of vectors being defined in both cases identically.

Multiplication  is  defined  differently,  which  is  unavoidable  since 
the  supplies  of  the  factors  differ.  Note  however  that  multiplication 
by  real  numbers  is  performed  in  the  same  manner  in  these  spaces. 

3. It is easy to see that the foregoing example is a special case of a more general phenomenon: with every complex linear space is associated a real space of twice the dimensionality of the complex space; also note that though the correspondence is not an isomorphism, it very much resembles an isomorphism. Namely, the following theorem holds.

Theorem. A complex linear space $C_n$ of dimension n may be mapped one-to-one onto a real linear space $L_{2n}$ of dimension 2n so that the condition

$$(a + b)' = a' + b' \qquad (1)$$

holds, and for real factors $\lambda$ the following condition holds:

$$(\lambda a)' = \lambda a' \qquad (2)$$



Remark. As in Section 10, the prime indicates the image in $L_{2n}$ of an element of $C_n$.

Proof. According to Section 10, all n-dimensional complex spaces are isomorphic. We can therefore confine ourselves to any one of them. For the $C_n$ space let us take a complex coordinate space. Let

$$a = \{x_1 + i x_2, \; x_3 + i x_4, \; \dots, \; x_{2n-1} + i x_{2n}\}$$

be any element in $C_n$. With this element we pair the element

$$a' = \{x_1, x_2, x_3, x_4, \dots, x_{2n-1}, x_{2n}\}$$

taken from the real coordinate space which plays the role of $L_{2n}$. Since the decomposition of a complex number into the real part and the imaginary part is performed in unique fashion, the correspondence established between $C_n$ and $L_{2n}$ is one-to-one. The truth of (1) and (2) for real $\lambda$ is obvious.

Remark. For n = 1 we have

$$a = \{x + iy\}, \quad a' = \{x, y\}$$

which returns us to the original example.
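The correspondence of the theorem and the rotation rule of Subsection 1 can both be tried out with Python's built-in complex numbers. This is only a sketch: the function `to_real`, the sample vectors, and the factors are assumptions made for the example, and the rotation check uses floating-point arithmetic, so it is compared up to a small tolerance.

```python
import cmath
import math


def to_real(a):
    """Image a' in the real space L_2n of a vector a of the complex space C_n."""
    out = []
    for z in a:
        out.extend([z.real, z.imag])
    return out


def add(u, v):
    return [p + q for p, q in zip(u, v)]


def scale(factor, v):
    return [factor * p for p in v]


a = [1 + 2j, 3 - 1j]
b = [0 + 1j, 2 + 2j]
lam = 3.0  # a real factor

additive = to_real(add(a, b)) == add(to_real(a), to_real(b))            # condition (1)
real_homogeneous = to_real(scale(lam, a)) == scale(lam, to_real(a))     # condition (2)

# A complex factor acts on the plane as a rotation followed by a stretching:
# alpha = 2 (cos 90 deg + i sin 90 deg) carries the vector {1, 0} to about {0, 2}.
alpha = cmath.rect(2.0, math.pi / 2)
rotated = alpha * (1 + 0j)
rotation_ok = abs(rotated - 2j) < 1e-12

# For n = 1 the vectors 1 and i are dependent over the complex numbers
# (i = i * 1), while their images {1, 0} and {0, 1} are independent over the reals.
image_of_one = to_real([1 + 0j])
image_of_i = to_real([1j * (1 + 0j)])

print(additive, real_homogeneous, rotation_ok, image_of_one, image_of_i)
```

Condition (2) is checked for a real factor only; for a complex factor such as i there is no corresponding scalar multiplication in the real space, which is why the correspondence is not an isomorphism.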

4. In the sequel, we will assume all geometric vectors to be elements of real space.

§  12.  Linear  subspace 

1. Let L be a linear space and $\tilde{L}$ a certain set of elements in L.

Definition. The set $\tilde{L}$ in the space L is called a linear subspace if the following conditions hold true:

(1) for any x, y in $\tilde{L}$ their sum x + y also lies in $\tilde{L}$;

(2) for any $x \in \tilde{L}$ and any scalar $\alpha$, the product $\alpha x \in \tilde{L}$.

Remark. For the sake of brevity we will often say subspace instead of linear subspace.

2. Let $\tilde{L}$ be a linear subspace of L. The operations of addition of vectors and their multiplication by scalars given in L will be considered relative only to those elements that enter into $\tilde{L}$. Then the following theorem holds true.

Theorem 1. Every linear subspace $\tilde{L}$ of a linear space L is itself a linear space.

Proof. By the definition of a subspace, addition and scalar multiplication are closed operations in $\tilde{L}$. The axioms (1)-(2) and (5)-(8) of a linear space are certainly fulfilled in $\tilde{L}$ since in general they hold true for all elements of L. Therefore, the proof only requires the verification of axioms (3) and (4), that is, we have to establish that together with every element x of $\tilde{L}$ the subspace $\tilde{L}$ includes the additive inverse element -x and that $0 \in \tilde{L}$.

By the second condition in the definition of a subspace we have

$$-x = (-1) \cdot x \in \tilde{L}$$

Using the first condition, we get

$$0 = x + (-x) \in \tilde{L}$$

which completes the proof.

3. The intersection of a collection of sets is the collection of those elements that belong simultaneously to all the sets under consideration. The intersection of two sets $\mathcal{A}$ and $\mathcal{B}$ is denoted by the symbol $\mathcal{A} \cap \mathcal{B}$. This notation will be used frequently in the sequel.

Theorem 2. The intersection of any collection of subspaces of a given linear space L is also a linear subspace.

Proof. For the sake of simplicity, the proof will be carried out for the case of two subspaces $L_1$ and $L_2$. Let $L_3 = L_1 \cap L_2$, and let the vectors x, y lie in $L_3$. When regarding x, y as elements of $L_1$, we find, by the definition of a subspace, that $x + y \in L_1$, $\alpha x \in L_1$ ($\alpha$ an arbitrary scalar). In exactly the same way, $x + y \in L_2$, $\alpha x \in L_2$. But this means that $x + y \in L_3$, $\alpha x \in L_3$ and therefore $L_3$ satisfies the definition of a subspace. This completes the proof of Theorem 2.

4. Examples of subspaces. (1) The set $\tilde{L}$ consisting of the single zero element 0 of a given space constitutes a subspace, for

$$0 + 0 = 0 \in \tilde{L}, \quad \alpha 0 = 0 \in \tilde{L}.$$

(2) In the $n^2$-dimensional space of square n × n matrices, the set of symmetric matrices $\|a_{ik}\|$, that is, such that $a_{ik} = a_{ki}$, forms a subspace.

The set of skew-symmetric matrices, that is, such that $a_{ik} = -a_{ki}$, also forms a subspace in the space of n × n matrices.

(3) In the space of all possible functions specified on the interval $t_1 \le t \le t_2$, each of the following sets forms a linear subspace:

(a) the functions continuous at some interior point $t_0$ of the interval $t_1 < t < t_2$;

(b) the functions continuous in the interval $t_1 < t < t_2$;

(c) the functions continuous on the entire interval $[t_1, t_2]$;

(d) the functions continuous on the interval $[t_1, t_2]$ together with their derivatives up to order N inclusive, where N is an arbitrary positive integer;

(e) the functions having derivatives of all orders on the interval $[t_1, t_2]$;

(f) all polynomials considered on the interval $[t_1, t_2]$;

(g) polynomials of degree not exceeding a fixed integer N > 0.

Each of the subspaces enumerated in Example (3) is contained in the preceding one, and all of them, with the exception of the last one, are infinite-dimensional (the last one is of dimension N + 1);

(h) on the interval $[t_1, t_2]$ fix an arbitrary set of points $\mathcal{A}$. The functions equal to zero at points of the set $\mathcal{A}$ also form a subspace. By way of an exercise we leave it to the reader to figure out how the dimension of this subspace depends on the choice of the set $\mathcal{A}$.

(4) Let L be the three-dimensional space of geometric vectors in ordinary Euclidean space. We assume the vectors to issue from the coordinate origin. Let us consider all the vectors located in some plane passing through the origin. These vectors form a subspace.

It is left to the reader to prove that the above-mentioned subspaces do indeed satisfy the definition of Subsection 1.

5.  The  following  are  instances  of  subsets  of  a  linear  space  that 
are  not  subspaces. 

(a) In the three-dimensional space of geometric vectors we consider the collection of vectors whose termini lie in a fixed plane not passing through the coordinate origin. They do not form a subspace since both the sum of two vectors and the product of a vector by any scalar ≠ 1 are not members of this subset.

(b) In the same space we consider vectors whose termini lie on the surface of a cone with vertex at the coordinate origin. The product of any vector of this set by any scalar is also a member of this set. Nevertheless, the indicated set is not a subspace since, generally speaking, it is not closed under the operation of addition.
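Both counterexamples can be made concrete in coordinates. In the sketch below the plane z = 1 and the cone x² + y² = z² are particular choices assumed for the illustration, as are the two sample vectors; they are not prescribed by the text.

```python
def in_plane(v):
    """True if the terminus of v lies in the plane z = 1, which misses the origin."""
    return v[2] == 1


def on_cone(v):
    """True if the terminus of v lies on the cone x^2 + y^2 = z^2 with vertex at the origin."""
    x, y, z = v
    return x * x + y * y == z * z


def add(u, v):
    return tuple(p + q for p, q in zip(u, v))


def scale(alpha, v):
    return tuple(alpha * p for p in v)


u, v = (1, 0, 1), (0, 1, 1)  # both vectors end in the plane and on the cone

# (a) The plane: neither sums nor multiples (factor other than 1) stay in the set.
print(in_plane(u), in_plane(v), in_plane(add(u, v)), in_plane(scale(3, u)))

# (b) The cone: multiples stay on the cone, but the sum u + v leaves it.
print(on_cone(scale(5, u)), on_cone(scale(-2, v)), on_cone(add(u, v)))
```

For the cone, the first condition of the definition of a subspace fails while the second holds, which shows that the two conditions are independent of one another.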

§  13.  Linear  hull 

1. Suppose in a linear space L we have a system of vectors $a_1, \dots, a_k$.

Definition. The set of all linear combinations of the form

$$x = \alpha_1 a_1 + \dots + \alpha_k a_k$$

is called a linear hull of the given system and is denoted by the symbol $L(a_1, \dots, a_k)$.

We sometimes say that $L(a_1, \dots, a_k)$ is a linear hull spanned by the vectors $a_1, \dots, a_k$.



Theorem 1. The linear hull of any system of vectors is a linear subspace of the space L.

Proof. Let us take arbitrary vectors x and y from the linear hull $L(a_1, \dots, a_k)$:

$$x = \alpha_1 a_1 + \dots + \alpha_k a_k \in L(a_1, \dots, a_k),$$
$$y = \beta_1 a_1 + \dots + \beta_k a_k \in L(a_1, \dots, a_k)$$

Then

$$x + y = (\alpha_1 + \beta_1) a_1 + \dots + (\alpha_k + \beta_k) a_k \in L(a_1, \dots, a_k)$$

Besides, for any scalar $\lambda$, we have

$$\lambda x = (\lambda \alpha_1) a_1 + \dots + (\lambda \alpha_k) a_k \in L(a_1, \dots, a_k)$$

2. Remark. The linear hull $L(a_1, \dots, a_k)$ may coincide with the entire space L (for example, if $a_1, \dots, a_k$ is a basis in L).

3. Theorem 2. If every vector of a system $c_1, \dots, c_m$ is linearly expressible in terms of the vectors of the system $a_1, \dots, a_k$, then

$$L(c_1, \dots, c_m) \subset L(a_1, \dots, a_k) \; *$$

Proof. Let $x \in L(c_1, \dots, c_m)$; that is, suppose x can be expressed in terms of $c_1, \dots, c_m$. Then, by Property (6), Subsection 2, Section 4, the vector x can be expressed in terms of $a_1, \dots, a_k$. Hence $x \in L(a_1, \dots, a_k)$. Thus

$$L(c_1, \dots, c_m) \subset L(a_1, \dots, a_k)$$

Corollary.  The  linear  hull  of  any  subsystem  of  a  given  system 
of  vectors  is  included  in  the  linear  hull  of  the  entire  given  system. 

Theorem 3. If the system $a_1, \dots, a_k$ has rank r > 0, then any one of its linearly independent subsystems consisting of r vectors is a basis in the linear hull $L(a_1, \dots, a_k)$.

Proof. The system $a_1, \dots, a_k$ of rank r (r > 0) has a linearly independent subsystem consisting of r vectors.

For the sake of definiteness, suppose that the first r vectors of the given system are linearly independent. Then, since the rank of the system $a_1, \dots, a_k$ is equal to r, it follows that each of the vectors $a_1, \dots, a_k$ is linearly expressible in terms of the vectors $a_1, \dots, a_r$. From this and by Theorem 2,

$$L(a_1, \dots, a_r, a_{r+1}, \dots, a_k) \subset L(a_1, \dots, a_r)$$

On the other hand, by the Corollary to Theorem 2,

$$L(a_1, \dots, a_r) \subset L(a_1, \dots, a_r, a_{r+1}, \dots, a_k)$$


* The symbol $\subset$ indicates the inclusion of the first of the sets in the second.
$\mathcal{A} \subset \mathcal{B}$ is read as "$\mathcal{A}$ is contained in $\mathcal{B}$" (it is not precluded that $\mathcal{A}$ may
coincide with $\mathcal{B}$).




Hence $L(a_1, \dots, a_r, a_{r+1}, \dots, a_k)$ coincides with $L(a_1, \dots, a_r)$. And
so every element x in $L(a_1, \dots, a_k)$ can be resolved in terms of
$a_1, \dots, a_r$. For this reason and because of the independence of the
vectors $a_1, \dots, a_r$, they constitute a basis in $L(a_1, \dots, a_k)$.

Theorem 4. If the rank of the system $a_1, \dots, a_k$ is equal to r,
then $L(a_1, \dots, a_k)$ is an r-dimensional subspace.


Fig.  1 


Fig. 2


Proof. Suppose that r > 0. Then, by the preceding theorem,
there is a basis in $L(a_1, \dots, a_k)$ consisting of r elements, whence,
by the theorem of Subsection 4, Section 8, $L(a_1, \dots, a_k)$ is of
dimension r. Suppose r = 0. Then $a_1 = \dots = a_k = \theta$. But then
$L(a_1, \dots, a_k)$ includes only $\theta$ and, hence, has the dimension 0.
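Theorems 3 and 4 can be tried out numerically. The following Python sketch is an illustration added here, not part of the original text: it assumes the vectors are given by rational components, and the function name `hull_basis` is hypothetical. It picks out a linearly independent subsystem by elimination; the number of vectors chosen is the rank r, and hence the dimension of the linear hull.

```python
from fractions import Fraction


def hull_basis(vectors):
    """Return indices of a maximal linearly independent subsystem.

    Each vector is reduced against the rows already kept; it is kept
    only if something nonzero remains, i.e. it is not in their hull.
    """
    kept_rows = []  # pairs (pivot column, reduced row)
    chosen = []     # indices of the vectors that form the basis
    for index, vector in enumerate(vectors):
        row = [Fraction(c) for c in vector]
        for pivot_col, kept in kept_rows:
            if row[pivot_col] != 0:
                factor = row[pivot_col] / kept[pivot_col]
                row = [r - factor * s for r, s in zip(row, kept)]
        pivot = next((j for j, c in enumerate(row) if c != 0), None)
        if pivot is not None:
            kept_rows.append((pivot, row))
            chosen.append(index)
    return chosen


# a2 = 2*a1 and a4 = a1 + a3, so the rank of the system is 2.
system = [(1, 0, 1), (2, 0, 2), (0, 1, 1), (1, 1, 2)]
basis_indices = hull_basis(system)
print("basis:", [system[i] for i in basis_indices])
print("dimension of the hull:", len(basis_indices))
```

Exact fractions are used instead of floating-point numbers so that the test "is the remainder zero" is not affected by rounding.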


4. Let us consider some examples. (1) Let a, b, c ($a \neq \theta$) be
geometric vectors lying on a single straight line. Then
$L(a, b, c) = L(a)$ (Fig. 1).

Here, L(a) is a one-dimensional subspace consisting of all vectors
lying on the given straight line. In this subspace, the vector a
constitutes a basis.

(2) Let a, b, c be geometric vectors with a and b not collinear,
c = a + b. In that case $L(a, b, c) = L(a, b)$ (Fig. 2) so that an
arbitrary vector $x \in L(a, b, c)$ can be represented as
$x = \alpha a + \beta b$.



Here,  L(a,  b)  is  a  two-dimensional  subspace  that  consists  of  all 
vectors  coplanar  with  the  vectors  a,  b.  The  vectors  a,  b  constitute 
a  basis  in  L(a,  b). 

(3) Let the functions $1, \tau, \tau^2, \dots, \tau^N$ be regarded as elements
of a linear space L of continuous functions specified on the interval
$[\tau_1, \tau_2]$. Then $L(1, \tau, \tau^2, \dots, \tau^N)$ is a subspace of L consisting of all
polynomials of degree not higher than N.

5. In conclusion we note an obvious but important proposition:
any subspace of a finite-dimensional space is a linear hull of some
system of elements.

To prove this, it suffices to note that any subspace $\tilde L$ of an
n-dimensional space L is finite-dimensional (and has dimension $\le n$;
this is clear since the maximum number of linearly independent
elements in $\tilde L$ cannot exceed that in L). But then either $\tilde L$ is
zero-dimensional, and then $\tilde L = L(\theta)$, or $\tilde L$ has a basis $q_1, \dots, q_r$,
and then $\tilde L = L(q_1, \dots, q_r)$.

In the last case, the subspace $\tilde L$ consists of vectors (and only
such vectors) that have the form

$$x = t_1 q_1 + \dots + t_r q_r \qquad (1)$$

where $t_1, \dots, t_r$ are arbitrary scalars.

Suppose in the space L we have a basis $e_1, \dots, e_n$, and
$x = x_1 e_1 + \dots + x_n e_n$, $q_i = q_{1i} e_1 + q_{2i} e_2 + \dots + q_{ni} e_n$ (here
$i = 1, \dots, r$; $x_1, \dots, x_n$ are the components of the vector x, and
$q_{1i}, q_{2i}, \dots, q_{ni}$ are the components of the vector $q_i$).

Then formula (1) may be replaced by the relations in components:

$$\begin{aligned}
x_1 &= q_{11} t_1 + q_{12} t_2 + \dots + q_{1r} t_r, \\
&\;\;\vdots \\
x_n &= q_{n1} t_1 + q_{n2} t_2 + \dots + q_{nr} t_r
\end{aligned} \qquad (2)$$

Relations of the form (2) are called the parametric equations of
the subspace $L(q_1, \dots, q_r)$.
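As a small illustration of the parametric equations (a sketch added here, with the hypothetical function name `point_of_subspace`), each component $x_j$ of a vector of the subspace is the sum $q_{j1} t_1 + \dots + q_{jr} t_r$ over the basis vectors. The basis and the parameter values below are assumed sample data.

```python
def point_of_subspace(q, t):
    """Components x_j = q_j1*t_1 + ... + q_jr*t_r of the vector
    x = t_1*q_1 + ... + t_r*q_r.

    q is a list of r basis vectors (each a list of n components),
    t is a list of r parameter values.
    """
    if len(q) != len(t):
        raise ValueError("one parameter is needed per basis vector")
    n = len(q[0])
    return [sum(q_i[j] * t_i for q_i, t_i in zip(q, t)) for j in range(n)]


# A two-dimensional subspace of three-dimensional space (sample data).
q1 = [1, 0, 2]
q2 = [0, 1, -1]
print(point_of_subspace([q1, q2], [3, 4]))  # x = 3*q1 + 4*q2
```

Letting the parameters run through all scalar values sweeps out the whole subspace, and every vector of the subspace is obtained in this way.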

§  14.  Sum  of  subspaces.  Direct  sum 

1. In a linear space L let there be given two linear subspaces
$L_1$ and $L_2$. We denote by $\tilde L$ the set of all vectors x that can be
represented in the form

$$x = x_1 + x_2$$

where $x_1 \in L_1$, $x_2 \in L_2$ (Fig. 3). It is readily seen that $\tilde L$ is a
linear subspace of L. Indeed, together with $x \in \tilde L$ take another
vector $x' \in \tilde L$, that is,

$$x' = x'_1 + x'_2$$

where $x'_1 \in L_1$, $x'_2 \in L_2$; then the vector

$$x + x' = (x_1 + x'_1) + (x_2 + x'_2)$$

belongs to $\tilde L$ since $x_1 + x'_1 \in L_1$, $x_2 + x'_2 \in L_2$. Besides,

$$\alpha x = \alpha x_1 + \alpha x_2 \in \tilde L \quad \text{since } \alpha x_1 \in L_1, \ \alpha x_2 \in L_2$$

The subspace $\tilde L$ is called the sum of the subspaces $L_1$ and $L_2$ and
is denoted thus: $\tilde L = L_1 + L_2$. Fig. 3 depicts a special case where
$\tilde L = L$ is three-dimensional and $L_1$ and $L_2$ are two-dimensional.


2. The notion of a sum of subspaces carries over directly to any
number of terms. Given in the space L the subspaces $L_1, \dots, L_k$; their
sum

$$\tilde L = L_1 + L_2 + \dots + L_k$$

is then a linear subspace consisting of all vectors of the form

$$x = x_1 + \dots + x_k \quad \text{where } x_1 \in L_1, \ \dots, \ x_k \in L_k \qquad (1)$$

3. Definition. If for every $x \in \tilde L$ the resolution (1) is unique,
then $\tilde L$ is called the direct sum of the subspaces $L_1, \dots, L_k$.


For the direct sum we use the symbol $\oplus$, for example,

$$\tilde L = L_1 \oplus L_2 \oplus \dots \oplus L_k$$

We will use the symbol $\oplus$ in cases where it is necessary to stress
that we are dealing with a direct sum.

By way of an illustration, Fig. 4 depicts the direct sum of the
one-dimensional subspaces $L_1$ and $L_2$. Note that the sum $L_1 + L_2$
in Fig. 3 is not a direct sum.



In  the  next  two  subsections  we  give  the  conditions  that  are 
necessary  and  sufficient  for  a  sum  of  subspaces  to  be  a  direct 
sum. 

4. Theorem 1. The sum $\tilde L = L_1 + \dots + L_k$ is a direct sum of
the subspaces $L_1, \dots, L_k$ if and only if not one of the subspaces
$L_1, \dots, L_k$ has any elements, except $\theta$, in common with the sum of
the remaining subspaces.

Proof. (1) Suppose it is given that the intersection of each of the
subspaces under consideration with the sum of the remaining ones
consists solely of the zero vector $\theta$. We will prove that
$\tilde L = L_1 \oplus \dots \oplus L_k$. Suppose that there are two resolutions of the
vector $x \in \tilde L$:

$$x = x_1 + x_2 + \dots + x_k, \quad x = \bar x_1 + \bar x_2 + \dots + \bar x_k \qquad (2)$$

where $x_j \in L_j$, $\bar x_j \in L_j$. It is required to verify that

$$x_j = \bar x_j \qquad (3)$$

for each of the numbers j. From (2) we have

$$\theta = (x_1 - \bar x_1) + (x_2 - \bar x_2) + \dots + (x_k - \bar x_k) \qquad (4)$$

Set $y_1 = -(x_1 - \bar x_1)$. Then $y_1 = \bar x_1 - x_1 \in L_1$ and, by (4),
$y_1 = (x_2 - \bar x_2) + \dots + (x_k - \bar x_k) \in L_2 + \dots + L_k$, and therefore $y_1 = \theta$, that is,
$x_1 = \bar x_1$.

Now, introducing $y_2 = -(x_2 - \bar x_2)$ and taking advantage of (4)
and also of the fact that $L_2 \cap (L_1 + L_3 + \dots + L_k) = \theta$, we get
$x_2 = \bar x_2$.

The remaining equations of (3) are proved similarly.

(2) Suppose that for any $x \in \tilde L$ the resolution (1) is unique.
We will demonstrate that, for example, $L_1$ does not have any common
elements with $L_2 + \dots + L_k$ except $\theta$. Suppose the opposite,
that is, that there is a $z \neq \theta$ such that $z \in L_1$, $z \in L_2 + \dots + L_k$.
But then $z = z_2 + \dots + z_k$, where $z_2 \in L_2, \dots, z_k \in L_k$. We can
therefore write

$$\theta = z + (-1) z_2 + \dots + (-1) z_k$$

where $z \in L_1$, $z_2 \in L_2, \dots, z_k \in L_k$. On the other hand,

$$\theta = \theta + \theta + \dots + \theta$$

Thus, for $\theta \in \tilde L$ we have obtained two distinct resolutions of the
form (1), which is a contradiction of the hypothesis.

5. Theorem 2. $\tilde L = L_1 + \dots + L_k$ is a direct sum of the
subspaces $L_j$ if and only if every system of nonzero vectors
$a_1, \dots, a_k$, taken one at a time from each $L_j$ (i.e., $a_j \in L_j$,
$j = 1, \dots, k$), is linearly independent.



Proof. (1) Let $\tilde L$ be the direct sum of the subspaces $L_1, \dots, L_k$.
We take arbitrary nonzero vectors $a_1, \dots, a_k$, one at a time from
each $L_j$ ($a_j \in L_j$). We will prove that they are linearly independent.
Suppose that the system $a_1, \dots, a_k$ is dependent. Then one of
these vectors is linearly expressible in terms of the others. For the
sake of definiteness we assume that $a_1$ is linearly expressible in
terms of $a_2, \dots, a_k$. But then this vector belongs to $L_1$ and to the
sum $L_2 + \dots + L_k$, which runs counter to Theorem 1.

(2) Suppose any system of nonzero vectors $a_1, \dots, a_k$ taken
respectively from $L_1, \dots, L_k$ is linearly independent. We will prove
that $\tilde L = L_1 + \dots + L_k$ is a direct sum. Assume the contrary. Then,
by Theorem 1, one of the subspaces $L_1, \dots, L_k$ has a nonzero vector
in common with the sum of the remaining ones. For example,
suppose the nonzero vector $a_1$ belongs to $L_1$ and to $L_2 + \dots + L_k$.
Then $a_1 = a'_2 + \dots + a'_k$ ($a'_2 \in L_2, \dots, a'_k \in L_k$). In place of this
relation we can write $a_1 = \varepsilon_2 a_2 + \dots + \varepsilon_k a_k$, taking $\varepsilon_i = 0$ in the
case $a'_i = \theta$, and, in this case, taking for $a_i$ any vector from $L_i$ so
long as it is not equal to $\theta$; but if $a'_i \neq \theta$, then we assume that
$\varepsilon_i = 1$ and $a_i = a'_i$.

Thus are indicated the nonzero vectors $a_1, \dots, a_k$ ($a_i \in L_i$) which
are connected by a linear dependence. The result is a contradiction
with the hypothesis of the theorem.

6. If the space L itself is resolved into the direct sum of its
subspaces $L_1, \dots, L_k$, then each vector x is uniquely resolved into
its components $x_1, \dots, x_k$ lying respectively in $L_1, \dots, L_k$.

In particular, if $e_1, \dots, e_n$ is a basis in L, then L can be
resolved into the direct sum of one-dimensional subspaces:
$L = L_1 \oplus \dots \oplus L_n$, where $L_i$ is the linear hull of the basis vector $e_i$
(that is, $L_i$ consists of vectors obtained by multiplying $e_i$ by all
possible scalars).
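To make the unique resolution concrete, here is a short Python sketch (an illustration added here; the function name `resolve` and the sample basis are assumptions). In the plane, a basis $e_1, e_2$ gives the direct sum $L(e_1) \oplus L(e_2)$, and the two components of a vector x are found by solving $x = c_1 e_1 + c_2 e_2$ with Cramer's formulas.

```python
from fractions import Fraction


def resolve(x, e1, e2):
    """Resolve the plane vector x into components lying in L(e1) and L(e2).

    Solves x = c1*e1 + c2*e2 by Cramer's formulas; the solution is
    unique exactly when e1, e2 form a basis (nonzero determinant).
    """
    d = e1[0] * e2[1] - e1[1] * e2[0]
    if d == 0:
        raise ValueError("e1 and e2 do not form a basis")
    c1 = Fraction(x[0] * e2[1] - x[1] * e2[0], d)
    c2 = Fraction(e1[0] * x[1] - e1[1] * x[0], d)
    return [c1 * e1[0], c1 * e1[1]], [c2 * e2[0], c2 * e2[1]]


x1, x2 = resolve((3, 1), (1, 1), (1, -1))
print(x1, x2)  # components along e1 = (1, 1) and e2 = (1, -1)
```

If the two vectors were collinear, the determinant would vanish and the sum of their hulls would not be direct, which the sketch reports as an error.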

7. Theorem 3. Given in a linear space L the subspaces $L_k$ and
$L_l$ of dimension k and l respectively. If their intersection is of
dimension m, then the dimension of their sum $L_k + L_l$ is equal to
$r = k + l - m$.*

To  prove  Theorem  3  we  will  need  the  following  lemma. 

Lemma. In n-dimensional space, any linearly independent system of
fewer than n vectors may be completed to constitute a basis.

Proof of lemma. Let $e_1, \dots, e_k$ be an independent system of
vectors, k < n. There will be at least one vector $e_{k+1}$ such that
$e_1, \dots, e_k, e_{k+1}$ is also an independent system. If there were no
such vector $e_{k+1}$, then any vector of L could be expressed in terms
of $e_1, \dots, e_k$, but this contradicts the hypothesis k < n.

If k + 1 < n, the above argument may be repeated and it is
possible to adjoin one more vector to the system without disrupting
the linear independence. This procedure may be continued
until the number of vectors in the system reaches n; then it will
turn into a basis. The proof of the lemma is complete.


* In particular, for m = 0, the sum $L_k + L_l$ will be a direct sum.

Proof of Theorem 3. Put $L_m = L_k \cap L_l$ and choose in the
subspace $L_m$ a basis $e_1, \dots, e_m$. Using the lemma, complete this basis
to bases in the subspaces $L_k$ and $L_l$:

$e_1, \dots, e_m, e_{m+1}, \dots, e_k$ is a basis in $L_k$,

$e_1, \dots, e_m, e'_{m+1}, \dots, e'_l$ is a basis in $L_l$

Comparing the definition of a sum of subspaces with the definition
of a linear hull (see Section 13), we find that

$$L_k + L_l = L(e_1, \dots, e_m, e_{m+1}, \dots, e_k, e'_{m+1}, \dots, e'_l) \qquad (5)$$

We will prove that the vectors

$$e_1, \dots, e_m, e_{m+1}, \dots, e_k, e'_{m+1}, \dots, e'_l \qquad (6)$$

are linearly independent. Assume the contrary. Suppose there
exists the nontrivial relation

$$\alpha_1 e_1 + \dots + \alpha_m e_m + \alpha_{m+1} e_{m+1} + \dots + \alpha_k e_k + \alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = \theta \qquad (7)$$

Among the numbers $\alpha_{k+1}, \dots, \alpha_r$ there are those that differ from
zero, otherwise the vectors $e_1, \dots, e_m, e_{m+1}, \dots, e_k$ that form a
basis in $L_k$ would be linearly dependent. Set

$$\alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = x \neq \theta \qquad (8)$$

From (8) it follows that $x \in L_l$, and from (7), that $x \in L_k$;
therefore $x \in L_l \cap L_k$. Hence, x is linearly expressible in terms of
the vectors $e_1, \dots, e_m$. Thus we have a relation of the form

$$\alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = \beta_1 e_1 + \dots + \beta_m e_m \qquad (9)$$

Equation (9) signifies a nontrivial linear dependence among the
vectors $e_1, \dots, e_m, e'_{m+1}, \dots, e'_l$, which is impossible since the
indicated vectors form a basis in $L_l$. This is a contradiction, which
completes the proof of the independence of the vectors (6). Now,
relation (5) signifies that the vectors (6) form a basis in $L_k + L_l$.
Hence, the dimension of $L_k + L_l$ is equal to the number of vectors
in the system (6), that is, to the number $r = k + l - m$. The proof
of Theorem 3 is complete.
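The formula $r = k + l - m$ can be checked on a concrete pair of planes. The Python sketch below is an illustration added here, with assumed sample data and a hypothetical helper `rank`. The two planes in three-dimensional space share the vector (1, 0, -1), so m = 1, and their sum should have dimension 2 + 2 - 1 = 3.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a system of vectors, found by Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


basis_k = [(1, 1, 0), (0, 1, 1)]   # basis of the first plane, k = 2
basis_l = [(1, 0, -1), (0, 0, 1)]  # basis of the second plane, l = 2
common = (1, 0, -1)                # (1,1,0) - (0,1,1), lies in both planes

k, l = rank(basis_k), rank(basis_l)
in_both = (rank(basis_k + [common]) == k) and (rank(basis_l + [common]) == l)
dim_sum = rank(basis_k + basis_l)  # dimension of the linear hull of both bases
m = 1                              # the planes are distinct and share one line
print(k, l, in_both, dim_sum, dim_sum == k + l - m)
```

The dimension of the sum is computed as the rank of the united system of basis vectors, which is exactly how relation (5) describes the sum as a linear hull.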

8. Theorem 4. The dimension of a direct sum of subspaces is
equal to the sum of the dimensions of the summands. The union of
any bases taken one at a time in each summand forms a basis in
the direct sum.

Proof. If there are two summands, then the first assertion of
Theorem 4 follows from Theorem 3 with account taken of Theorem 1,
as a consequence of which m = 0.

The second assertion of Theorem 4 in the case of two summands
follows from the proof of Theorem 3, more precisely, from the fact
that the system of vectors (6) is independent (now in the notation
of (6) we have to put m = 0 and strike out the vectors $e_1, \dots, e_m$).

Further note that if $L_1 + L_2 + L_3 + \dots + L_p$ is a direct sum,
then $L_1 + L_2$, $L_1 + L_2 + L_3 = (L_1 + L_2) + L_3$, etc., are also direct
sums. Therefore, in the general case, both assertions of Theorem 4
are proved by induction.


9. Finally, we note the following associative property for direct
sums: if

$$L = L_1 \oplus \tilde L \qquad \text{(I)}$$

$$\tilde L = L_2 \oplus \dots \oplus L_k \qquad \text{(II)}$$

then

$$L = L_1 \oplus L_2 \oplus \dots \oplus L_k \qquad \text{(III)}$$

Proof. Let x be any element in L. We have $x = x_1 + \tilde x$, where
$x_1 \in L_1$, $\tilde x \in \tilde L$. Since $\tilde x \in \tilde L$, it follows that $\tilde x = x_2 + \dots + x_k$,
where $x_2 \in L_2, \dots, x_k \in L_k$. Thus, for every x in L we have

$$x = x_1 + x_2 + \dots + x_k \qquad (*)$$

where $x_i \in L_i$. Conversely, from (*) it follows that $x \in L$. We will
prove the uniqueness of (*). Let

$$x = x'_1 + x'_2 + \dots + x'_k, \quad x'_i \in L_i$$

whence $x = x'_1 + \tilde x'$. Here $\tilde x' = x'_2 + \dots + x'_k$. Consequently, $\tilde x' \in \tilde L$.
Therefore and by the definition (I) we get $x'_1 = x_1$, $\tilde x' = \tilde x$. From
the last equation and by definition (II) we find $x'_i = x_i$ for
$i = 1, 2, \dots, k$, which proves (III).

This property has to be used when the resolution of a space (or
subspace) L into a direct sum is performed in succession:
$L = L_1 \oplus L'$, $L' = L_2 \oplus L''$, and so on (see for example Chapter VII,
Section 10).


Chapter  II 


LINEAR  TRANSFORMATIONS 
OF  VARIABLES. 
TRANSFORMATIONS  OF  COORDINATES 


§  1.  Abbreviated  notation  for  summation 

1. In the sequel we will often have to do with sums in which the
summands are denoted by a single indexed letter; for example,
$a_m + a_{m+1} + \dots + a_N$. In such cases it is convenient to use the
following abbreviated notation for the sum:

$$a_m + a_{m+1} + \dots + a_N = \sum_{i=m}^{N} a_i = \sum_{m \le i \le N} a_i$$

(read: "the sum of $a_i$ from i = m to N").

2. Properties of the summation symbol.

(1) $\sum_{i=1}^{N} a = Na$, since there are N identical terms equal to a for
every i.

(2) A common factor can be taken outside the summation symbol:

$$\sum_{i=m}^{N} c a_i = c \sum_{i=m}^{N} a_i$$

(3) $$\sum_{i=m}^{N} (a_i + b_i) = \sum_{i=m}^{N} a_i + \sum_{i=m}^{N} b_i$$

(4) The magnitude of the sum does not depend on the letter
used for the summation index:

$$\sum_{i=m}^{N} a_i = a_m + a_{m+1} + \dots + a_{N-1} + a_N = \sum_{j=m}^{N} a_j = \sum_{k=m}^{N} a_k$$




(5) If the summation is over two distinct indices, each of which
varies independently of the other, the order in which they are
summed is immaterial:

$$\sum_{\substack{m_1 \le i \le N_1 \\ m_2 \le j \le N_2}} a_{ij} = \sum_{i=m_1}^{N_1} \left( \sum_{j=m_2}^{N_2} a_{ij} \right) = \sum_{j=m_2}^{N_2} \left( \sum_{i=m_1}^{N_1} a_{ij} \right)$$

When summing over different indices, one ordinarily drops the
parentheses and in place of $\sum_{i=m_1}^{N_1} \left( \sum_{j=m_2}^{N_2} a_{ij} \right)$ one writes $\sum_{i=m_1}^{N_1} \sum_{j=m_2}^{N_2} a_{ij}$.
It is assumed that the terms $a_{ij}$ are first summed with respect to j
with i held constant (inner sum), and then the resulting quantities
are summed with respect to i (outer sum). The fifth property
carries over to the case of summation with respect to three or more
distinct indices.

Note that if the range of one index depends on another summation
index, then upon any change in the order of summation the
range of each of the indices is, generally speaking, different. In
particular,

$$\sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij} = \sum_{1 \le i \le j \le n} a_{ij} = \sum_{j=1}^{n} \sum_{i=1}^{j} a_{ij}$$

(6) When summing over two (or several) indices, a factor that
does not depend on the index of inner summation may be taken
outside the sign of the inner sum:

$$\sum_{i=m_1}^{N_1} \sum_{j=m_2}^{N_2} b_i a_{ij} = \sum_{i=m_1}^{N_1} b_i \sum_{j=m_2}^{N_2} a_{ij}$$

The aforementioned properties of the sigma (summation) symbol
follow directly from the rules of arithmetic operations and are
made frequent use of in the sequel.
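Properties (5) and (6) are easy to confirm on a small array. The following Python sketch is an illustration added here; the array entries and the factors $b_i$ are arbitrary sample values. It compares the two orders of summation over a full range, the two orders over the triangular range $1 \le i \le j \le n$, and the two sides of the identity obtained by taking a factor outside the inner sum.

```python
n = 4


def a(i, j):
    """Sample term a_ij with 1-based indices (arbitrary test values)."""
    return 10 * i + j


b = {i: i * i for i in range(1, n + 1)}  # sample factors b_i

# Property (5): independent ranges, either index may be summed first.
i_outer = sum(sum(a(i, j) for j in range(1, n + 1)) for i in range(1, n + 1))
j_outer = sum(sum(a(i, j) for i in range(1, n + 1)) for j in range(1, n + 1))

# Dependent ranges: both sums run over the pairs with 1 <= i <= j <= n,
# but the limits change when the order of summation is changed.
tri_i_outer = sum(sum(a(i, j) for j in range(i, n + 1)) for i in range(1, n + 1))
tri_j_outer = sum(sum(a(i, j) for i in range(1, j + 1)) for j in range(1, n + 1))

# Property (6): b_i does not depend on the inner index j.
inside = sum(sum(b[i] * a(i, j) for j in range(1, n + 1)) for i in range(1, n + 1))
outside = sum(b[i] * sum(a(i, j) for j in range(1, n + 1)) for i in range(1, n + 1))

print(i_outer == j_outer, tri_i_outer == tri_j_outer, inside == outside)
```

In the triangular case the inner limits move from "j from i to n" to "i from 1 to j", which is the change of range the text warns about.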


3. To abbreviate notation, let us agree that if the range of a
summation index is not indicated, it is assumed that the summation
is from 1 to n; for example,

$$\sum_{i} a_i = \sum_{i=1}^{n} a_i$$

Besides, if the summation from 1 to n is over several indices that
are independent of one another, we will write one summation
symbol and under it all the indices over which the summation is
to be extended. That is,

$$\sum_{i, j, k} a_{ijkl} = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijkl}$$




The  point  is  that  later  on  we  will  very  often  have  to  do  with 
summing  from  1  to  n,  where  n  is  the  dimension  of  a  space. 

Let us further agree that if the summation indices under the
summation symbol are not indicated at all, this means that the
summation is to be carried out with respect to all the indices that
appear twice under the summation symbol, and the summation is
from 1 to n with respect to each of these indices. For example,

$$\sum a_i b_i = a_1 b_1 + a_2 b_2 + \dots + a_n b_n;$$

$$\sum A_i a_{ij} x_j = \sum_{i=1}^{n} \sum_{j=1}^{n} A_i a_{ij} x_j = \sum_{i=1}^{n} A_i \sum_{j=1}^{n} a_{ij} x_j$$

Note the inner sum in the right member of the last equation. The
general term of this sum depends on two indices, but the summation
is carried out only with respect to one of them (the index j),
so that the result of the sum depends on the other index (the
number i). Putting $y_i = \sum_{j=1}^{n} a_{ij} x_j$ and taking advantage of the
abbreviated notations, we can write

$$y_i = \sum a_{ij} x_j$$

The indices with respect to which the summation is performed
are often called dummy indices (or umbral indices). By Property
(4), Subsection 2, a dummy index may be changed in the course
of a computation, as for example,

$$y_i = \sum a_{ij} x_j = \sum a_{i\alpha} x_\alpha, \quad i = 1, \dots, n \qquad (1)$$

Indices over which summation is not performed are ordinarily
called free indices. It is important to see that the free indices in
the right and the left member of every equation are denoted in the
same way. For example, the equations (1) may be replaced by the
following equivalent notation:

$$y_k = \sum a_{kj} x_j = \sum a_{k\alpha} x_\alpha, \quad k = 1, \dots, n$$

but it is not permissible to change the notation of the free index in
only one member of an equation.

Later on we will see (Chapter V) that in many cases it will be
convenient to write the indices as superscripts ($x^i$, $a^{ik}$, and so
forth). It is important that in the course of a computation the
superscripts remain superscripts and the subscripts remain subscripts
if they are free indices.

Later on we will have to do with summation symbols carrying
several dummy indices. In such cases, the independent dummy
indices must be denoted by distinct letters.




For  example, 

Z  Bblyy'  =  2p»  Z  Bk  (Z  AfiX^y1)  =  Z 

The  indices  /'  and  p  remain  free  (this  means  that  the  preceding 
line  replaces  n2  equations). 

4. Remark. Considerable use in the literature is made of a still
more compact notation in which not only the limits of the summation
and the indices are dropped but even the summation symbol
as well. In that notation, we have, for instance,

$$A^i_k z^k = \sum_{k=1}^{n} A^i_k z^k$$

The use of such notation requires well-developed habits on the part
of the reader. In this text we will not drop the summation symbol.

§  2.  Linear  transformation  of  variables.  The  product  of  linear 
transformations  of  variables  and  matrix  products 

1. Let $x_1, \dots, x_n$ be an ordered n-tuple of independent variables.

Suppose we have an m × n matrix:

$$B = \begin{Vmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \dots & b_{mn} \end{Vmatrix}$$

We can write the relations

$$\begin{aligned}
y_1 &= b_{11} x_1 + \dots + b_{1n} x_n, \\
&\;\;\vdots \\
y_m &= b_{m1} x_1 + \dots + b_{mn} x_n
\end{aligned} \qquad (1)$$

where $y_1, \dots, y_m$ denote the numerical values of the right members
of (1). It is clear that $y_1, \dots, y_m$ vary with $x_1, \dots, x_n$.

The set of relations (1) is called a linear transformation of the
variables $x_1, \dots, x_n$ into the variables $y_1, \dots, y_m$. The numbers $b_{jk}$
are called the coefficients of the linear transformation (1). The
matrix B made up of these coefficients is termed the matrix of the
given linear transformation. When specified, the matrix determines
the linear transformation (1).

Remark. The ordered n-tuple of variables $x_1, \dots, x_n$ might be
regarded as a variable point in coordinate space. We can interpret
$y_1, \dots, y_m$ geometrically in exactly the same way. However, it is
advisable to define a linear transformation of variables as a purely
arithmetic (or algebraic) concept and not relate it beforehand to
any geometric conceptions.




2. Given a p × m matrix

$$A = \begin{Vmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{p1} & \dots & a_{pm} \end{Vmatrix}$$

Let us write down the corresponding linear transformation of
variables in the form

$$\begin{aligned}
z_1 &= a_{11} y_1 + \dots + a_{1m} y_m, \\
&\;\;\vdots \\
z_p &= a_{p1} y_1 + \dots + a_{pm} y_m
\end{aligned} \qquad (2)$$

Here, the independent variables are denoted by $y_1, \dots, y_m$,
although the relations (2) are at first considered irrespective of
relations (1).

At the same time, we might regard the $y_1, \dots, y_m$ in the right-hand
members of (2) as the same quantities defined in (1) with
respect to the variables $x_1, \dots, x_n$, in which case $z_1, \dots, z_p$ become
functions of the independent variables $x_1, \dots, x_n$, and
$y_1, \dots, y_m$ have the role of intermediaries.

If the intermediate variables $y_1, \dots, y_m$ are eliminated from (1)
and (2), then $z_1, \dots, z_p$ will be expressed explicitly in terms of
$x_1, \dots, x_n$. To carry out this operation, replace the $y_j$ in the right
members of (2) by their expressions in (1). Then in each equation
of (2) each of the variables $x_1, \dots, x_n$ will occur m times. Collecting
like terms and denoting the resulting coefficients by $c_{11}, c_{12}, \dots$,
we have

$$\begin{aligned}
z_1 &= c_{11} x_1 + \dots + c_{1n} x_n, \\
&\;\;\vdots \\
z_p &= c_{p1} x_1 + \dots + c_{pn} x_n
\end{aligned} \qquad (3)$$

We thus get another linear transformation of the variables; the
matrix is

$$C = \begin{Vmatrix} c_{11} & \dots & c_{1n} \\ \vdots & & \vdots \\ c_{p1} & \dots & c_{pn} \end{Vmatrix}$$

Definition. The linear transformation of variables (3) obtained
by eliminating $y_1, \dots, y_m$ from (2) and (1) is called the product
of the linear transformation of variables (2) by the linear
transformation of variables (1). Here, the p × n matrix C is termed the
product of the p × m matrix A by the m × n matrix B. Symbolically
we have

$$C = AB$$


3. Let us now find a formula to express any element of the
matrix C in terms of the elements of the matrices A and B. To do
this we have to actually eliminate the quantities $y_1, \dots, y_m$ from
the relations (1) and (2), and not merely make that statement.
To reduce the amount of computation, we write system (1) more
compactly as

$$y_j = \sum_{k=1}^{n} b_{jk} x_k$$

where $j = 1, 2, \dots, m$ correspond to the numbers of the equations.
System (2) is written down similarly:

$$z_i = \sum_{j=1}^{m} a_{ij} y_j$$

where $i = 1, \dots, p$. We now have

$$z_i = \sum_{j=1}^{m} a_{ij} \left( \sum_{k=1}^{n} b_{jk} x_k \right) = \sum_{k=1}^{n} \left( \sum_{j=1}^{m} a_{ij} b_{jk} \right) x_k \qquad (4)$$

On the other hand, we can abbreviate the equations (3) to

$$z_i = \sum_{k=1}^{n} c_{ik} x_k \qquad (5)$$

From (4) and (5) we get

$$c_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} \qquad (6)$$

or

$$c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \dots + a_{im} b_{mk} \qquad (7)$$

Expression (7) is called the product of the ith row of matrix A
by the kth column of matrix B (by analogy with the familiar
analytic-geometry formula expressing the scalar product of vectors
in terms of their components).

The number of columns of matrix A must be equal to the number
of rows of matrix B, otherwise the product AB is not defined. The
number of rows and columns in the product may be expressed
according to the following scheme:

$$(p \times m) \cdot (m \times n) = (p \times n)$$

The two products AB and BA are simultaneously defined if and
only if A is a p × m matrix and B is an m × p matrix; they are of
the same size only when A and B are square matrices of the same order.
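Formula (7) translates directly into a few lines of code. The Python sketch below is an illustration added here; the function names `mat_mul` and `apply` and the sample matrices are assumptions. It builds C = AB element by element and then checks, for one sample x, that transforming by B and then by A gives the same result as transforming by C.

```python
def mat_mul(A, B):
    """Product C = AB with c_ik = a_i1*b_1k + ... + a_im*b_mk (formula (7))."""
    m = len(B)
    if any(len(row) != m for row in A):
        raise ValueError("columns of A must equal rows of B")
    n = len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(n)]
            for i in range(len(A))]


def apply(M, x):
    """Linear transformation of variables with matrix M applied to x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in M]


A = [[1, 2, 0],
     [0, 1, 3]]   # p x m = 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 4]]      # m x n = 3 x 2
x = [1, -1]

C = mat_mul(A, B)        # p x n = 2 x 2
y = apply(B, x)          # relations (1)
z_two_steps = apply(A, y)  # relations (2)
z_one_step = apply(C, x)   # relations (3)
print(C, z_two_steps, z_one_step)
```

The size check mirrors the scheme (p × m) · (m × n) = (p × n): the product is refused when the inner dimensions disagree.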

Remark  1.  Of  course  matrix  products  could  have  been  defined 
directly  with  the  aid  of  formula  (7)  and  without  taking  into 
account  its  origination  from  linear  transformations. 

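Remark 2. Generally speaking, matrix multiplication is not
commutative, as is readily seen from some examples. Let

$$A = \begin{Vmatrix} 1 & 0 \\ 0 & 0 \end{Vmatrix}, \quad B = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}$$

Then

$$AB = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix} \neq BA = \begin{Vmatrix} 0 & 0 \\ 0 & 0 \end{Vmatrix}$$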

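4. Let us write the sets of variables $x_i$, $y_j$, $z_k$ in the form of
column matrices:

$$X = \begin{Vmatrix} x_1 \\ \vdots \\ x_n \end{Vmatrix}, \quad Y = \begin{Vmatrix} y_1 \\ \vdots \\ y_m \end{Vmatrix}, \quad Z = \begin{Vmatrix} z_1 \\ \vdots \\ z_p \end{Vmatrix}$$

Then the formulas for transforming variables (1), (2) and (3)
may be written in the form of matrix equations:

$$Y = BX, \quad Z = AY, \quad Z = CX$$

where C = AB.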

5. The following are a number of identities expressing the
properties of matrix multiplication:

(1) A(BC) = (AB)C (associativity). By this property, the
product of three matrices ABC can be written without parentheses.

(2) $(\alpha A)B = A(\alpha B) = \alpha AB$;

(3) A(B + C) = AB + AC; (B + C)A = BA + CA.

Here, $\alpha$ is an arbitrary scalar, and A, B, C are arbitrary matrices
whose numbers of columns and rows allow the foregoing operations
to be performed.

The proof of identities (1), (2), and (3) is elementary and we
do not give it here.

6. There is one more matrix operation that will be used in the
sequel. It is called transposition or taking the transpose of a
matrix. We denote it by $A^*$, which is the matrix obtained from
matrix A by replacing the rows by the corresponding (as to number)
columns. The following obvious identities occur:

(4) $(A + B)^* = A^* + B^*$;

(5) $(\alpha A)^* = \alpha A^*$.

And also

(6) $(AB)^* = B^* A^*$.

The validity of the last identity is quite evident if we take into
account the fact that the product of A by B is constructed as
follows: row A into column B.
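Identity (6) and the associativity identity (1) can be spot-checked on sample matrices. The Python sketch below is an illustration added here; the helper names and the matrices are assumptions, and a check on one example is of course not a proof.

```python
def mat_mul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(M):
    """Replace the rows of M by the corresponding columns."""
    return [list(col) for col in zip(*M)]


A = [[1, 2, 0],
     [0, 1, 3]]    # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 4]]       # 3 x 2
D = [[2, -1],
     [1, 5]]       # 2 x 2

lhs = transpose(mat_mul(A, B))             # (AB)*
rhs = mat_mul(transpose(B), transpose(A))  # B* A*
assoc_left = mat_mul(A, mat_mul(B, D))     # A(BD)
assoc_right = mat_mul(mat_mul(A, B), D)    # (AB)D
print(lhs == rhs, assoc_left == assoc_right)
```

Note the reversed order on the right-hand side of (6): with the sizes used here, $A^* B^*$ would be a 3 × 3 matrix, while $(AB)^*$ is 2 × 2.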




§  3.  Square  matrices  and  nonsingular  transformations 

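1. In this section we will discuss in more detail the linear
transformations under which the number of variables is preserved.
These transformations are associated with square matrices.


2. It will be recalled that the determinant of an n by n matrix
$A = \|a_{ij}\|$ is the quantity

$$\det A = \sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n} a_{i_1 1} a_{i_2 2} \dots a_{i_n n} \qquad (1)$$

where

$$\varepsilon_{i_1 \dots i_n} = \begin{cases}
0 & \text{if there are identical numbers among the numbers } i_1, i_2, \dots, i_n; \\
+1 & \text{if the permutation } (i_1, i_2, \dots, i_n) \text{ is even;} \\
-1 & \text{if the permutation } (i_1, i_2, \dots, i_n) \text{ is odd}
\end{cases}$$

The indices $i_1, i_2, \dots, i_n$ take on the values 1, 2, ..., n.

In (1), the second indices of the elements of matrix A are taken
in their natural order. For any other arrangement of them and
also in the case of repetitions, we have

$$\sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n} a_{i_1 j_1} a_{i_2 j_2} \dots a_{i_n j_n} = \varepsilon_{j_1 \dots j_n} \det A \qquad (2)$$

We will now prove a theorem that will come in handy later on.
Theorem. The determinant of a product of matrices is equal to
the product of their determinants:

$$\det AB = \det A \cdot \det B$$

Proof. Using (1), (2) and formula (6) of the preceding section
we get

$$\det AB = \det C = \sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n} c_{i_1 1} \dots c_{i_n n}$$

$$= \sum_{j_1, \dots, j_n} \left( \sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n} a_{i_1 j_1} \dots a_{i_n j_n} \right) b_{j_1 1} \dots b_{j_n n}$$

$$= \sum_{j_1, \dots, j_n} (\det A) \, \varepsilon_{j_1 \dots j_n} b_{j_1 1} \dots b_{j_n n} = \det A \cdot \det B$$

which is what we set out to prove.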
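The definition of the determinant as a signed sum over permutations can be coded directly and used to check the product theorem on examples. The Python sketch below is an illustration added here; the function names and sample matrices are assumptions. Terms with repeated indices contribute nothing, so the sum runs only over permutations, with the sign taken from the parity of the number of inversions.

```python
from itertools import permutations


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for s in range(len(perm))
                     for t in range(s + 1, len(perm)) if perm[s] > perm[t])
    return -1 if inversions % 2 else 1


def det(A):
    """Determinant as the sum of signed products a_{i1,1} a_{i2,2} ... a_{in,n}."""
    total = 0
    for perm in permutations(range(len(A))):
        term = sign(perm)
        for column, row in enumerate(perm):
            term *= A[row][column]  # first index from the permutation
        total += term
    return total


def mat_mul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
B = [[1, 0, 2],
     [0, 1, 1],
     [3, 0, 1]]
print(det(A), det(B), det(mat_mul(A, B)))
```

The permutation sum has n! terms, so this form is useful for checking small cases and for reasoning, not for large matrices.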


3. A square matrix is said to be nonsingular (that is, invertible)
if its determinant is not equal to zero. The linear transformation
of the variables $x_1, \dots, x_n$ into the variables $y_1, \dots, y_n$ is said to be
nonsingular if it has a nonsingular square matrix.



By  Subsection  2,  the  product  of  nonsingular  matrices  is  a  non¬ 
singular  matrix.  The  product  of  nonsingular  linear  transformations 
of  variables  is  a  nonsingular  linear  transformation  of  variables. 

A nonsingular n × n matrix A has rank r = n. This is clear
since  the  determinant  of  such  a  matrix  is  the  basis  minor  of  the 
matrix.  A  singular  matrix  has  rank  r  <  n.  This  is  also  clear  since 
the  determinant  of  a  singular  matrix  is  equal  to  zero  and,  hence, 
the  basis  minors  of  the  matrix  have  order  less  than  n  (or  they  are 
absent,  but  then  r  =  0).  It  is  the  reduction  in  the  rank  of  the 
matrix  (compared  with  the  standard  case  of  r  =  n)  that  makes 
for  singularity. 

4. Suppose we have a nonsingular linear transformation of
variables with matrix $A = \|a_{ij}\|$:

\[ \begin{aligned}
y_1 &= a_{11} x_1 + \dots + a_{1n} x_n, \\
&\;\;\vdots \\
y_n &= a_{n1} x_1 + \dots + a_{nn} x_n
\end{aligned} \tag{3} \]

We introduce the notation

\[ \Delta = \det A \]

By hypothesis, $\Delta \neq 0$. In that case, system (3) with arbitrary given
$y_1, \dots, y_n$ and unknown $x_1, \dots, x_n$ has a unique solution. Let us
find it. It is convenient to use the Cramer formulas. By these
formulas, $x_k$ is expressed in the form of a fraction whose denominator
is $\Delta$ and whose numerator is obtained by replacing the kth column
of $\Delta$ by a column made up of the quantities $y_1, \dots, y_n$. Expanding
the determinant in the numerator of such a fraction, we get

\[ x_k = \frac{1}{\Delta} \left( A_{1k} y_1 + \dots + A_{nk} y_n \right) \]

where $A_{1k}, \dots, A_{nk}$ denote the cofactors of the elements of the kth
column of the determinant of matrix A. Thus

\[ \begin{aligned}
x_1 &= \frac{A_{11}}{\Delta} y_1 + \dots + \frac{A_{n1}}{\Delta} y_n, \\
&\;\;\vdots \\
x_n &= \frac{A_{1n}}{\Delta} y_1 + \dots + \frac{A_{nn}}{\Delta} y_n
\end{aligned} \tag{4} \]

The equations (4), which express $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$
from formula (3), constitute a linear transformation of variables;
it is the inverse of the linear transformation (3), where $y_1, \dots, y_n$
are expressed in terms of $x_1, \dots, x_n$.




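The matrix of transformation (4) is said to be the inverse of
the (nonsingular) matrix A of the transformation (3). It is denoted
by $A^{-1}$. Thus, if

\[ A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}, \qquad \Delta = \det A \]

then

\[ A^{-1} = \begin{pmatrix} \dfrac{A_{11}}{\Delta} & \dots & \dfrac{A_{n1}}{\Delta} \\ \vdots & & \vdots \\ \dfrac{A_{1n}}{\Delta} & \dots & \dfrac{A_{nn}}{\Delta} \end{pmatrix} \]

Note that in $A^{-1}$ we find, located in the rows, the cofactors of
those elements of A which in A are located in the corresponding
(as to number label) columns.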
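
The cofactor construction of the inverse can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names `minor`, `det`, `cofactor`, `inverse` and `matmul` and the sample matrix are hypothetical, entries are assumed to be integers, and exact fractions are used so that the check against the unit matrix is not disturbed by rounding.

```python
from fractions import Fraction


def minor(matrix, row, col):
    """Delete the given row and column (0-based indices)."""
    return [
        [value for c, value in enumerate(r) if c != col]
        for i, r in enumerate(matrix)
        if i != row
    ]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if not matrix:
        return 1  # determinant of the empty 0 x 0 matrix
    return sum(
        (-1) ** col * matrix[0][col] * det(minor(matrix, 0, col))
        for col in range(len(matrix))
    )


def cofactor(matrix, row, col):
    """Cofactor of the element standing in the given row and column."""
    return (-1) ** (row + col) * det(minor(matrix, row, col))


def inverse(matrix):
    """Row k of the inverse holds the cofactors of column k, divided by det."""
    n = len(matrix)
    delta = det(matrix)
    if delta == 0:
        raise ValueError("a singular matrix has no inverse")
    return [
        [Fraction(cofactor(matrix, i, k), delta) for i in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    """Product C = AB with c_ik = sum_j a_ij * b_jk."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
A_inv = inverse(A)
E = [[int(i == j) for j in range(3)] for i in range(3)]
# Both products should equal the unit matrix.
print(matmul(A, A_inv) == E, matmul(A_inv, A) == E)
```

Note the transposition inside `inverse`: the entry in row k and column i is the cofactor of the element in row i and column k of A.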


5. Since equations (4) express the solution of system (3) with
respect to the unknowns $x_1, \dots, x_n$, substitution of the expressions
(4) into the right-hand members of (3) should give the following
result:

\[ y_1 = y_1, \quad y_2 = y_2, \quad \dots, \quad y_n = y_n \tag{5} \]

This is called an identity transformation. Its matrix is denoted
by the letter I or E and is called the unit matrix (or identity
matrix):

\[ E = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix} \]


Since the product of the transformation (3) by the inverse
transformation (4) is an identity transformation (5), the product
of the matrix A by the inverse matrix $A^{-1}$ is correspondingly a unit
matrix E:

\[ A A^{-1} = E \tag{6} \]


6. By Subsection 2, we have from (6)

\[ (\det A) \cdot (\det A^{-1}) = 1 \]

whence $\det A^{-1} \neq 0$, and the matrix $A^{-1}$ is nonsingular. Then the
solution of system (4) for $y_1, \dots, y_n$ is unique and, hence, coincides
with the expressions (3). For this reason, substitution of (3)
into the right members of (4) must yield the identity transformation

\[ x_1 = x_1, \quad x_2 = x_2, \quad \dots, \quad x_n = x_n \]

At the same time we get

\[ A^{-1} A = E \]

7. To summarize, every nonsingular linear transformation of
the variables $x_1, \dots, x_n$ into $y_1, \dots, y_n$ has a unique inverse linear
transformation of the variables $y_1, \dots, y_n$ into $x_1, \dots, x_n$. It too
is nonsingular and its inverse is the original transformation.
Every nonsingular matrix has a unique inverse, which is also
nonsingular and the inverse of which is the original matrix.

8.  The  notions  of  an  inverse  transformation  and  an  inverse 
matrix  may  be  defined  in  a  somewhat  different  manner.  Namely, 
suppose  we  have  two  linear  transformations  of  variables,  which 
we  write  compactly  as 

\[ y_i = \sum_{j} a_{ij} x_j \tag{7} \]

with the n × n matrix $A = \|a_{ij}\|$ and

\[ x_j = \sum_{k} b_{jk} z_k \tag{8} \]

with the n × n matrix $B = \|b_{jk}\|$. We can then speak of the
transformation (8) as being the inverse of (7) if the product of (7) by
(8) is the identity transformation

\[ y_i = z_i \quad (i = 1, \dots, n) \tag{9} \]

Correspondingly, the matrix B may be called the inverse of A if

\[ AB = E \tag{10} \]

The nonsingularity of the transformation (7) and the matrix A
need not be specially stipulated, for it unavoidably follows from
(10), because, due to (10), $(\det A) \cdot (\det B) = 1$, and therefore
$\det A \neq 0$. But provided $\det A \neq 0$, the system (7) with unknowns
$x_1, \dots, x_n$ and knowns $y_1, \dots, y_n$ has a unique solution. Hence,
the expressions (8), with account taken of (9), that is, of the fact
that $z_k = y_k$, must coincide with the expressions
(4). This brings us back to our original definition of an inverse
transformation and an inverse matrix.




9. If A and B are nonsingular, then we have the identity

\[ (AB)^{-1} = B^{-1} A^{-1} \]

Proof. Let $B^{-1} A^{-1} = C$; then

\[ (AB)C = (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AEA^{-1} = (AE)A^{-1} = AA^{-1} = E \]

Hence, $C = (AB)^{-1}$ by Subsection 8, (10).

10. In the preceding subsection we took advantage of the fact
that the following readily verifiable identity holds true for any
n × n matrix A: AE = A (and also EA = A), where E is the unit
matrix. We can say that the matrix E plays the same role in matrix
multiplication as does the number unity in the multiplication of
ordinary numbers.

Remark. It is easy to prove that there is no other matrix with
that property. What is more, if for any one nonsingular matrix A
the equation $AE_1 = A$ holds, then $E_1 = E$. Indeed,

\[ E_1 = EE_1 = (A^{-1}A)E_1 = A^{-1}(AE_1) = A^{-1}A = E \]

11. Together with the multiplication of matrices, the natural
powers of a matrix are defined: $A^2 = AA$, $A^3 = AAA$, and so on. This
in turn defines the notion of a polynomial of a matrix:

\[ P(A) = \alpha_0 A^n + \alpha_1 A^{n-1} + \dots + \alpha_{n-1} A + \alpha_n E \tag{11} \]

where $\alpha_0, \dots, \alpha_n$ are scalars and P(A) is an n × n matrix.

It is a general agreement that any n × n matrix A to the power
of zero is equal to the unit n × n matrix E:

\[ A^0 = E \]

Therefore, the term $\alpha_n E$ in (11) plays the part of a constant term.
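
A polynomial of a matrix can be evaluated as in the following Python sketch. It uses Horner's scheme rather than forming each power of A separately; that choice, the function names and the sample matrix are assumptions made for illustration only.

```python
def matmul(a, b):
    """Product C = AB with c_ik = sum_j a_ij * b_jk."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


def add_multiple_of_unit_matrix(a, alpha):
    """Return A + alpha * E."""
    return [
        [value + (alpha if i == j else 0) for j, value in enumerate(row)]
        for i, row in enumerate(a)
    ]


def matrix_polynomial(coefficients, a):
    """Evaluate alpha_0 A^n + ... + alpha_{n-1} A + alpha_n E.

    `coefficients` lists alpha_0, ..., alpha_n, highest power first.
    """
    size = len(a)
    result = [[0] * size for _ in range(size)]
    for alpha in coefficients:
        # Horner step: multiply the accumulated value by A, then add alpha * E.
        result = add_multiple_of_unit_matrix(matmul(result, a), alpha)
    return result


A = [[1, 1], [0, 1]]
print(matrix_polynomial([1, -2, 1], A))  # A^2 - 2A + E, that is, (A - E)^2
print(matrix_polynomial([5], A))         # constant polynomial: 5 * A^0 = 5E
```

The second call shows the role of the convention that A to the power zero is E: a constant polynomial evaluates to a multiple of the unit matrix.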

§  4.  The  rank  of  a  product  of  matrices 

1. Given an m × n matrix $A = \|a_{ij}\|$, an n × p matrix
$B = \|b_{jk}\|$ and an m × p matrix $C = \|c_{ik}\|$ equal to their product:
C = AB. The following theorem holds.

Theorem 1. The rank of a product of matrices does not exceed
the rank of any one of the factors, that is,

\[ \operatorname{rank} AB \le \operatorname{rank} A, \tag{1} \]

\[ \operatorname{rank} AB \le \operatorname{rank} B \tag{2} \]

Proof. We establish inequality (2), but first let us note two
trivial cases.



(1) If rank B = 0, then matrix B is a zero matrix. In that case
C = AB is a zero matrix too, rank C = 0 and (2) holds.

(2)  If  rank  B  =  p  (where  p  is  the  number  of  columns  in 
matrix  B),  then  (2)  also  holds,  since  rank  C  does  not  exceed  the 
number  of  columns  p  in  the  matrix  C. 

Suppose  rank  B  =  r  and  assume  that  0  <  r  <  p. 

On  this  assumption,  matrix  B  has  a  certain  system  of  r  basis 
columns  and  at  least  one  more  column  not  belonging  to  that 
system.  For  the  sake  of  definiteness,  suppose  the  first  r  columns 
of  B  are  basis  columns.  We  consider  any  column  k,  k  >  r.  By  the 
basis-minor  lemma,  it  is  expressed  linearly  in  terms  of  the  basis 
columns: 


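\[ \begin{pmatrix} b_{1k} \\ \vdots \\ b_{nk} \end{pmatrix}
= \alpha_1 \begin{pmatrix} b_{11} \\ \vdots \\ b_{n1} \end{pmatrix}
+ \dots + \alpha_r \begin{pmatrix} b_{1r} \\ \vdots \\ b_{nr} \end{pmatrix} \]

In abbreviated notation, this expression becomes

\[ b_{jk} = \sum_{s=1}^{r} \alpha_s b_{js} \tag{3} \]

where j = 1, 2, ..., n and the index k is fixed.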

On  the  other  hand,  by  the  definition  of  a  product  of  matrices  we 
have 

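\[ c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} \tag{4} \]

From (3) and (4)

\[ c_{ik} = \sum_{j} a_{ij} \Bigl( \sum_{s} \alpha_s b_{js} \Bigr) = \sum_{s} \Bigl( \alpha_s \sum_{j} a_{ij} b_{js} \Bigr) = \sum_{s} \alpha_s c_{is} \tag{5} \]

where $1 \le i \le m$, the index k is fixed and has the same numerical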
value  as  in  the  formula  (3).  Equations  (5)  show  that  in  the 
matrix  C  any  column  with  k  >  r  can  be  linearly  expressed  in 
terms  of  the  first  r  columns: 


\[ \begin{pmatrix} c_{1k} \\ \vdots \\ c_{mk} \end{pmatrix}
= \alpha_1 \begin{pmatrix} c_{11} \\ \vdots \\ c_{m1} \end{pmatrix}
+ \dots + \alpha_r \begin{pmatrix} c_{1r} \\ \vdots \\ c_{mr} \end{pmatrix} \]

whence, by virtue of Lemma 3 of Section 6, Chapter I, the
rank of the system of all columns of matrix C does not exceed
r, that is, rank C ≤ r, and the proof of inequality (2) is complete.

In  order  to  prove  inequality  (1)  it  is  now  sufficient  to  pass  to 
the  transposes.  Indeed,  by  Property  6  of  Subsection  6,  Section  2, 

\[ C^* = (AB)^* = B^* A^* \tag{6} \]

From (6) and due to the established inequality for the second
factor we have

\[ \operatorname{rank} C = \operatorname{rank} C^* \le \operatorname{rank} A^* = \operatorname{rank} A \]

This  completes  the  proof  of  the  theorem. 

2.  To  avoid  any  misunderstanding,  it  will  be  well  to  make  the 
following  warning  with  respect  to  the  proof  of  inequality  (2).  As 
has been demonstrated, any column number k in matrix C, k > r,
can be linearly expressed in terms of the first r columns. We do
not know whether these columns are independent or not. Therefore
we cannot assert that rank C = r. Simple numerical examples
will  make  it  clear  that  such  an  assertion  is  not  only  unfounded 
but  actually  erroneous. 

3. Given: an arbitrary m × n matrix A, a nonsingular square
n × n matrix B and a nonsingular square m × m matrix B'.

The  following  theorem  holds. 

Theorem  2.  The  ranks  of  the  products  AB  and  B'A  are  equal  to 
the  rank  of  matrix  A,  that  is, 

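\[ \operatorname{rank} AB = \operatorname{rank} A \quad \text{if } \det B \neq 0 \tag{7} \]

and

\[ \operatorname{rank} B'A = \operatorname{rank} A \quad \text{if } \det B' \neq 0 \tag{8} \]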

In  other  words,  the  rank  is  preserved  under  multiplication  (on 
the  left  or  on  the  right)  of  the  given  matrix  by  a  nonsingular 
matrix. 

Proof. Let C = AB. Then rank C ≤ rank A by Theorem 1. But
due to the nonsingularity of B we have $A = ABB^{-1} = CB^{-1}$, so
that rank A ≤ rank C by Theorem 1. This completes the proof of
(7).  Equation  (8)  is  proved  in  similar  fashion. 
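
Both the warning of Subsection 2 and Theorem 2 can be illustrated numerically. The Python sketch below computes ranks by Gaussian elimination in exact rational arithmetic; the method, the function names and the sample matrices are assumptions chosen for illustration and do not come from the text.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    found = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next(
            (r for r in range(found, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue  # no pivot in this column
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def matmul(a, b):
    """Product C = AB with c_ik = sum_j a_ij * b_jk."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


# Subsection 2: rank AB may be strictly less than the ranks of both factors.
A = [[1, 0], [0, 0]]
B = [[0, 0], [0, 1]]
print(rank(A), rank(B), rank(matmul(A, B)))

# Theorem 2: multiplication by a nonsingular matrix preserves the rank.
M = [[1, 2, 3], [2, 4, 6]]                  # the second row is twice the first
right = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # triangular, determinant 1
left = [[1, 1], [0, 1]]                     # triangular, determinant 1
print(rank(M), rank(matmul(M, right)), rank(matmul(left, M)))
```

In the first example both factors have rank 1 while the product is the zero matrix, so equality in Theorem 1 indeed cannot be asserted in general.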


§  5.  Transformation  of  coordinates  in  a  change  of  basis 

1. In many cases one finds it necessary to make a change of
basis. We will now derive formulas according to which the
coordinates of an arbitrary vector are transformed when passing to a
new basis.




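2. Let $L_n$ be an n-dimensional linear space and let $e_1, \dots, e_n$
be its basis. Let $e'_1, \dots, e'_n$ be the new basis in $L_n$. We expand
each of the vectors $e'_1, \dots, e'_n$ relative to the old basis to get

\[ \begin{aligned}
e'_1 &= P_{11} e_1 + P_{12} e_2 + \dots + P_{1n} e_n, \\
e'_2 &= P_{21} e_1 + P_{22} e_2 + \dots + P_{2n} e_n, \\
&\;\;\vdots \\
e'_n &= P_{n1} e_1 + P_{n2} e_2 + \dots + P_{nn} e_n
\end{aligned} \tag{I} \]

The coefficients of these expansions constitute the square n × n
matrix

\[ P = \begin{pmatrix} P_{11} & \dots & P_{1n} \\ \vdots & & \vdots \\ P_{n1} & \dots & P_{nn} \end{pmatrix} \]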

3. The ith row of matrix P is formed by the coordinates of the vector
$e'_i$ in the original basis. Since the new basis vectors $e'_1, \dots, e'_n$ are
independent, the matrix P is nonsingular, that is,

\[ \det P \neq 0 \tag{1} \]

On the other hand, if we arbitrarily take matrix P under
condition (1), then the vectors $e'_1, \dots, e'_n$ defined by the equations
(I) will be independent and, hence, will constitute a basis. Thus,
the transition to any new basis is determined by specification of
the matrix P under the sole condition that it be nonsingular.

4.  Let  x  be  an  arbitrary  vector  in  Ln.  Let  us  expand  it  in  terms 
of  the  old  basis  and  the  new  basis: 

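\[ x = \sum_{j} x_j e_j, \qquad x = \sum_{i} x'_i e'_i \]

We can write the formulas (I) compactly as

\[ e'_i = \sum_{j=1}^{n} P_{ij} e_j, \quad i = 1, \dots, n \tag{I} \]

whence and from the second expansion of x we have

\[ x = \sum_{i} x'_i \Bigl( \sum_{j} P_{ij} e_j \Bigr) = \sum_{j} \Bigl( \sum_{i} P_{ij} x'_i \Bigr) e_j \]

Comparing this expansion with the first, we find

\[ x_j = \sum_{i=1}^{n} P_{ij} x'_i, \quad j = 1, \dots, n \tag{II} \]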



or,  written  out  in  full, 

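\[ \begin{aligned}
x_1 &= P_{11} x'_1 + P_{21} x'_2 + \dots + P_{n1} x'_n, \\
x_2 &= P_{12} x'_1 + P_{22} x'_2 + \dots + P_{n2} x'_n, \\
&\;\;\vdots \\
x_n &= P_{1n} x'_1 + P_{2n} x'_2 + \dots + P_{nn} x'_n
\end{aligned} \]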

Formulas (II) express the old coordinates $x_1, \dots, x_n$ of the
vector x in terms of its new coordinates $x'_1, \dots, x'_n$ and constitute
a linear transformation of the variables $x'_1, \dots, x'_n$ into the
variables $x_1, \dots, x_n$. The matrix of this transformation is $P^*$, which
is the transpose of matrix P. Therefore, denoting by X and X' the
column matrices consisting of the old and new coordinates of the
vector x, we can write formulas (II) as a matrix equation:

\[ X = P^* X' \tag{IIa} \]

The transformation (II) is nonsingular since

\[ \det P^* = \det P \neq 0 \]

Thus,  any  change  in  the  basis  is  associated  with  a  nonsingular 
linear  transformation  of  the  coordinates  of  every  vector. 

5. The new coordinates of a vector are expressed in terms of the
old coordinates by means of a linear transformation that is
inverse to the transformation (II).

We  will  write  it  in  the  form 

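\[ \begin{aligned}
x'_1 &= Q_{11} x_1 + Q_{12} x_2 + \dots + Q_{1n} x_n, \\
x'_2 &= Q_{21} x_1 + Q_{22} x_2 + \dots + Q_{2n} x_n, \\
&\;\;\vdots \\
x'_n &= Q_{n1} x_1 + Q_{n2} x_2 + \dots + Q_{nn} x_n
\end{aligned} \]

or, abridged,

\[ x'_i = \sum_{j=1}^{n} Q_{ij} x_j, \quad i = 1, \dots, n \tag{III} \]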

The  transformation  matrix  (III),  which  we  denote  by  Q,  is  the 
inverse  of  P*: 

\[ Q = \begin{pmatrix} Q_{11} & \dots & Q_{1n} \\ \vdots & & \vdots \\ Q_{n1} & \dots & Q_{nn} \end{pmatrix} = (P^*)^{-1} \tag{2} \]

Note that each one of the elements of Q is equal to the cofactor
of the element of matrix P having the same row and column
number labels, divided by det P. As in (IIa), we can write

\[ X' = QX \tag{IIIa} \]
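
The passage between old and new coordinates can be sketched in Python as follows. The inverse is found here by Gauss-Jordan elimination, which is a substitute for the cofactor formula of the text, and the basis, the vector and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


def apply(m, v):
    """Multiply the matrix m by the column of numbers v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse(matrix):
    """Inverse by Gauss-Jordan elimination on the augmented matrix (M | E)."""
    n = len(matrix)
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Rows of P are the new basis vectors: e'_1 = e_1 + e_2, e'_2 = 2 e_2.
P = [[1, 1], [0, 2]]
Q = inverse(transpose(P))        # Q = (P*)^(-1), formula (2)

X = [3, 5]                       # old coordinates of some vector x
X_new = apply(Q, X)              # X' = Q X, formula (IIIa)
X_back = apply(transpose(P), X_new)  # X = P* X', formula (IIa)
print(X_new, X_back)
```

For this basis the new coordinates are 3 and 1, which agrees with the direct expansion 3(e_1 + e_2) + 1(2 e_2) = 3 e_1 + 5 e_2.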



6.  The  equation  (2)  relating  matrices  P  and  Q  may  be  rewritten 
in  two  other  equivalent  modes: 

\[ P^* Q = E, \qquad Q P^* = E \tag{3} \]

It will be useful to rewrite equations (3) in a form that makes
use of the so-called Kronecker delta $\delta_{ij}$ to denote elements of the
unit matrix E. By definition,

\[ \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j \end{cases} \]

In this way, the two matrix equations (3) are replaced,
accordingly, by two systems of numerical equations:

\[ \sum_{\alpha} P_{\alpha i} Q_{\alpha l} = \delta_{il}, \qquad \sum_{\alpha} Q_{k\alpha} P_{l\alpha} = \delta_{kl} \tag{4} \]

In the former, the ith row of $P^*$ is multiplied by the lth column
of Q; in the latter, the kth row of Q is multiplied by the lth
column of the matrix $P^*$.

7. In conclusion note that any nonsingular linear transformation
of the variables $x_1, \dots, x_n$ into the variables $x'_1, \dots, x'_n$ may
be regarded as a transformation of the coordinates of vectors in
an n-dimensional linear space. Indeed, if we are given the
transformation (III), then we know the matrix Q ($\det Q \neq 0$). From
this we find $P^* = Q^{-1}$ and $P = (P^*)^*$. If we know the matrix P,
we obtain the corresponding basis $e'_1, \dots, e'_n$ from the
formulas (I).


Chapter  III 


SYSTEMS  OF  LINEAR  EQUATIONS. 
PLANES  IN  AFFINE  SPACE 


§ 1. Affine space

1. In linear space, the elements are vectors regarded as entities
involved in linear operations. However, in many problems attention
is focussed on geometric facts associated with the mutual
positions of figures (subsets) in the space at hand, with the linear
operations of secondary interest.

It is for this reason that along with a linear space we introduce
the concept of an affine space whose elements are points. The
points of affine space are related in a definite way to the vectors
of linear space (in much the same way as is done in elementary analytic
geometry). The conditions for these relationships are given in the
next subsection together with a definition of an affine space.

2. Suppose we have a certain set 𝔄 whose elements will be
called points denoted by capital letters A, B, ..., M, .... Also
given is a certain linear space L. Now let every ordered pair of
points in 𝔄 be associated with a definite vector in L. If the pair of
points A, B is associated with the vector x, we write $x = \overrightarrow{AB}$.
Here, the symbol $\overrightarrow{AB}$ is merely a different notation for the vector x.

The first of the two points is the origin (tail) of the vector $\overrightarrow{AB}$,
the second point is the terminus (tip).

Definition. A set 𝔄 associated with a linear space L is said to
be an affine space if the following two axioms hold true.

(1) For every point A in 𝔄 and for every vector x in L there is
a unique point B in 𝔄 such that $\overrightarrow{AB} = x$.

(2) If $\overrightarrow{AB} = x$, $\overrightarrow{BC} = y$, then $\overrightarrow{AC} = x + y$ (Fig. 5).

An affine space is said to be real or complex, finite-dimensional
or infinite-dimensional, if the corresponding linear space L is
respectively real or complex, finite-dimensional or infinite-dimensional.
The dimension of an affine space 𝔄 is the number equal to the
dimension of the linear space L.

Remark 1. Every linear space L may be regarded as an affine
space 𝔄. It suffices merely to call the vectors points and to
associate the vector b − a ∈ L with each pair of vectors a, b
considered as points of the set 𝔄.

Remark 2. Every affine space 𝔄 may be regarded as a linear
space. It suffices merely to specify some point O in the space 𝔄.
Then with any arbitrary point M ∈ 𝔄 is associated its radius
vector $\overrightarrow{OM}$. The set of radius vectors of all points of the space 𝔄
is what constitutes the space L.




Remark  3.  In  the  future,  we  will  always  indicate  which  space, 
affine  or  linear,  we  are  dealing  with.  However,  it  is  possible  to 
agree  to  consider  an  affine  space  with  an  indicated  point  in  it. 
Then  we  will  have  at  our  disposal  both  points  and  vectors. 

3.  Note  the  following  two  elementary  properties  of  an  affine 
space. 

Theorem 1. Associated with every pair of coincident points in 𝔄
is the zero vector of L.

Proof. Assume that A is an arbitrary point and that a vector z
is associated with the pair of points A, A. Let x be an arbitrary
vector in L. Then by the first axiom of an affine space there is a
point B such that $\overrightarrow{AB} = x$. Applying the second axiom, we get

\[ x + z = z + x = \overrightarrow{AA} + \overrightarrow{AB} = \overrightarrow{AB} = x \]

hence, z = 0.

Theorem 2. If $\overrightarrow{AB} = x$, then $\overrightarrow{BA} = -x$.

Proof. Let $\overrightarrow{BA} = y$. Then

\[ x + y = \overrightarrow{AB} + \overrightarrow{BA} = \overrightarrow{AA} = 0 \]

whence y = −x.

§  2.  Affine  coordinates 

1. We will assume that the given affine space 𝔄 is n-dimensional,
and will introduce a so-called affine system of coordinates.

To do this, in 𝔄 we choose an arbitrary point O, called the
origin, and in the appropriate linear space L we take a basis
$e_1, \dots, e_n$. Let M be an arbitrary point in 𝔄. Together with the
coordinate origin, it defines a vector $\overrightarrow{OM} \in L$ called the radius
vector of the point M. Expanding the radius vector $\overrightarrow{OM}$ in terms of
the basis $e_1, \dots, e_n$, we get

\[ \overrightarrow{OM} = x_1 e_1 + x_2 e_2 + \dots + x_n e_n \]

The coefficients of this expansion, $x_1, \dots, x_n$, are called the affine
coordinates of the point M (referred to the chosen system with
origin O and basis $e_1, \dots, e_n$). Note that an affine system of
coordinates is given by two unlike entities: the point O in affine space
and the basis $e_1, \dots, e_n$ in linear space.

The coordinates of every point M are defined uniquely due to
the uniqueness of the expansion of the vector $\overrightarrow{OM}$ in terms of the
basis $e_1, \dots, e_n$.

2. Let there be given another arbitrary point N with coordinates
$y_1, \dots, y_n$. We will show how the coordinates of the vector
$\overrightarrow{MN}$ are expressed in terms of the affine coordinates of the points M
and N. Taking advantage of Axiom 2 and Theorem 2 of Section 1,
we find

\[ \overrightarrow{MN} = \overrightarrow{MO} + \overrightarrow{ON} = \overrightarrow{ON} - \overrightarrow{OM} = (y_1 - x_1) e_1 + \dots + (y_n - x_n) e_n \]

so that the vector $\overrightarrow{MN}$ has the coordinates $y_1 - x_1, \dots, y_n - x_n$.

In other words, to obtain the coordinates of the vector $\overrightarrow{MN}$,
subtract the coordinates of the origin from the coordinates of the
terminus of the vector.

3. Retaining the chosen basis, we translate the coordinate origin
from O to $O_1$. We denote by $a_1, \dots, a_n$ the coordinates of $O_1$ in
the original system and will assume them to be known. We then
find the new affine coordinates $x'_1, \dots, x'_n$ of the arbitrary point M.
(We denote the old coordinates of the point M by $x_1, \dots, x_n$.) We
have the vector equation

\[ \overrightarrow{OM} = \overrightarrow{OO_1} + \overrightarrow{O_1 M} \]

or, what is the same thing,

\[ x_1 e_1 + \dots + x_n e_n = a_1 e_1 + \dots + a_n e_n + x'_1 e_1 + \dots + x'_n e_n \]

whence, due to the uniqueness of the vector expansion in terms of
the basis, we find

\[ x_i = x'_i + a_i, \quad i = 1, \dots, n \]

4.  If  the  coordinate  origin  remains  fixed  and  the  basis  undergoes 
change,  then  the  affine  coordinates  of  the  points  are  transformed 
in  the  same  way  as  the  coordinates  of  their  radius  vectors,  that  is, 
by  the  formulas  of  Section  5,  Chapter  II. 



5. Now suppose we pass from the given affine coordinate system
with origin O and basis $e_1, \dots, e_n$ to a new system with origin O'
and basis $e'_1, \dots, e'_n$. Here, we assume as known the coordinates
of O' in the old system $(a_1, \dots, a_n)$ and also the vector
expansions of the new basis in terms of the old basis:

\[ e'_i = \sum_{j} P_{ij} e_j \]

Using the results of the two preceding subsections and of
Section 5, Chapter II, we get formulas that express the old
coordinates $x_1, \dots, x_n$ of an arbitrary point M in terms of its new
coordinates $x'_1, \dots, x'_n$:

\[ x_i = \sum_{j} P_{ji} x'_j + a_i, \quad i = 1, \dots, n \tag{1} \]

Besides formulas (1), we have the inverse formulas

\[ x'_i = \sum_{j} Q_{ij} (x_j - a_j) = \sum_{j} Q_{ij} x_j + a'_i \]

where $Q = (P^*)^{-1}$ (see Section 5, Chapter II), $a'_i = -\sum_{j} Q_{ij} a_j$,
i = 1, ..., n.
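
A two-dimensional instance of these formulas can be sketched in Python. The basis matrix, the new origin and the function names are assumptions chosen for illustration; the 2 × 2 inverse is written out explicitly so that the sketch stays short.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


def apply(m, v):
    """Multiply the matrix m by the column of numbers v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    """Inverse of a 2 x 2 integer matrix by the cofactor formula."""
    (a, b), (c, d) = m
    delta = a * d - b * c
    if delta == 0:
        raise ValueError("matrix is singular")
    return [
        [Fraction(d, delta), Fraction(-b, delta)],
        [Fraction(-c, delta), Fraction(a, delta)],
    ]


P = [[1, 1], [0, 2]]   # rows: e'_1 = e_1 + e_2, e'_2 = 2 e_2
a = [1, -1]            # old coordinates of the new origin O'
Q = inverse_2x2(transpose(P))   # Q = (P*)^(-1)


def old_from_new(x_new):
    """Formula (1) in matrix form: X = P* X' + a."""
    return [u + v for u, v in zip(apply(transpose(P), x_new), a)]


def new_from_old(x_old):
    """Inverse formula: X' = Q (X - a)."""
    return apply(Q, [u - v for u, v in zip(x_old, a)])


a_new = [-value for value in apply(Q, a)]   # a'_i = -sum_j Q_ij a_j

M_old = old_from_new([3, 1])
print(M_old, new_from_old(M_old), a_new)
```

The last line of the inverse formulas can be checked by adding `a_new` to `apply(Q, M_old)`, which should reproduce the new coordinates.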

Transformations  of  affine  coordinates  will  be  used  frequently  in 
the  sequel. 

§  3.  Planes 

1. Suppose in n-dimensional affine space $𝔄_n$ we have a fixed
arbitrary point A and, in the corresponding linear space $L_n$, a fixed
arbitrary r-dimensional subspace $L_r$.

Definition. The set of all points M of an affine space such that
$\overrightarrow{AM} \in L_r$ is called an r-dimensional plane passing through the
point A in the direction of the subspace $L_r$ (see Fig. 6, where
r = 2).

We also say that $L_r$ is the direction subspace of this plane. It is
obvious that every plane uniquely defines its direction subspace.

Point M is called the running point of the plane. Figure 6
depicts three positions, $M_1$, $M_2$, $M_3$, of the running point M.

2. Particular cases. (1) If r = 0, then the plane consists of the
single point A. Therefore, every point of an affine space may be
regarded as a zero-dimensional plane.

(2) A one-dimensional plane is called a straight line.

(3) A plane of dimension n − 1 is termed a hyperplane.

(4) When r = n the plane coincides with the entire space $𝔄_n$.

3.  In  the  definition  of  a  plane  we  isolated  a  point  A.  We  will 
prove  that  in  reality  all  points  of  the  plane  are  of  an  equal  status. 

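Denote the plane by $P_r$ and take a fixed arbitrary point B in $P_r$.
We have to prove that a point M belongs to the plane $P_r$ if and
only if $\overrightarrow{BM} \in L_r$ (that is to say, that any point B can play the
role of point A).

Let $\overrightarrow{BM} \in L_r$ (Fig. 7). By the definition of a plane, $\overrightarrow{AB} \in L_r$,
whence and by the definition of a subspace, $\overrightarrow{AM} = \overrightarrow{AB} + \overrightarrow{BM} \in L_r$.
Therefore, $M \in P_r$. Conversely, if $M \in P_r$, then $\overrightarrow{AM} \in L_r$;
consequently $\overrightarrow{BM} = \overrightarrow{AM} - \overrightarrow{AB} \in L_r$.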

4.  Theorem.  Every  r-dimensional  plane  in  affine  space  is  itself 
an  r-dimensional  affine  space. 

Proof. Given an affine space 𝔄 to which corresponds a linear
space L. Let $P_r$ be a plane passing through a point A in the
direction of the subspace $L_r$. In the plane $P_r$ take two arbitrary
points M, N. By the definition of an affine space, they are
associated with the vector $\overrightarrow{MN} \in L$. By the definition of a plane, the
vectors $\overrightarrow{AM}$ and $\overrightarrow{AN}$ belong to the subspace $L_r$. Hence,

\[ \overrightarrow{MN} = \overrightarrow{AN} - \overrightarrow{AM} \in L_r \]

Thus, to every ordered pair of points M, N of plane $P_r$ is
associated a vector $\overrightarrow{MN}$ in the r-dimensional linear space $L_r$. Here, the
observance, for $P_r$, of the first of the axioms of Subsection 2,
Section 1, follows from the definition of an r-dimensional plane; the
second of the axioms of Subsection 2, Section 1, holds true for $P_r$
because it holds for the entire affine space 𝔄. This completes the
proof of the theorem.

Remark. If the plane passes through the origin of an affine
system of coordinates in the direction of the subspace $L_r$, then the
aggregate of the radius vectors of its points forms a linear space,
which, by definition, coincides with the subspace $L_r$.

5. In the affine space 𝔄 let there be given r + 1 points
$A_0, A_1, \dots, A_r$. We say that these points are in general position
if they do not all lie in a single (r − 1)-dimensional plane.

The reader will have no difficulty in verifying that the points
$A_0, A_1, \dots, A_r$ are in general position if and only if the
vectors $\overrightarrow{A_0 A_1}, \dots, \overrightarrow{A_0 A_r}$ are linearly independent (Fig. 8). Note
that it is immaterial which of the points is taken as $A_0$ (that is,
as the origin of the vectors issuing from it to other points).

From the foregoing and from the definition of a plane it follows
that an r-dimensional plane, and only one such plane, passes
through the system of points $A_0, A_1, \dots, A_r$ lying in general
position.

6. Suppose in the space $𝔄_n$ we have a fixed affine system of
coordinates with origin O and basis $e_1, \dots, e_n$. Let us consider a
plane $P_r$ passing through point A in the direction of subspace $L_r$.

We assume that A has coordinates $p_1, \dots, p_n$ and that $L_r$ is
given as a linear hull of the independent system of vectors
$q_1, \dots, q_r$ (see Chapter I, Section 13, Subsection 5). Then the
radius vector $\overrightarrow{OM}$ of the running point of the plane may be written
as follows:

\[ \overrightarrow{OM} = \overrightarrow{OA} + \overrightarrow{AM} = \overrightarrow{OA} + \tau_1 q_1 + \dots + \tau_r q_r \tag{1} \]

where the parameters $\tau_1, \dots, \tau_r$ independently run through all
possible numerical values, and the vector $\overrightarrow{OA} = p_1 e_1 + \dots + p_n e_n$
(Fig. 9).

Resolve the vectors $q_1, \dots, q_r$ in terms of the basis $e_1, \dots, e_n$:

\[ q_i = q_{i1} e_1 + q_{i2} e_2 + \dots + q_{in} e_n \]

As usual, we denote the coordinates of the running point M by
$x_1, \dots, x_n$ and write down the vector equation (1) in coordinates
to obtain n numerical equations

\[ \begin{aligned}
x_1 &= q_{11} \tau_1 + q_{21} \tau_2 + \dots + q_{r1} \tau_r + p_1, \\
&\;\;\vdots \\
x_n &= q_{1n} \tau_1 + q_{2n} \tau_2 + \dots + q_{rn} \tau_r + p_n
\end{aligned} \tag{2} \]

These equations are called the parametric equations of the plane $P_r$
(in the given system of coordinates). Note that all the equations
of the system (2) are linear in the coordinates of the running point
and in the parameters $\tau_j$.

The converse is also true: the locus of points defined by the
equations (2) for all values of $\tau_j$ is a plane that passes through
the point A in the direction of the subspace $L_r = L(q_1, \dots, q_r)$.
Indeed, the equations (2) are equivalent to the vector equation (1),
which means that the vector $\overrightarrow{AM} \in L_r$.

If for the system (2) we write the corresponding homogeneous
system (that is, if we replace $p_1, \dots, p_n$ by zeros), we get the
parametric equations of the direction subspace of the plane $P_r$.
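
The parametric equations (2) can be tried out on a two-dimensional plane in three-dimensional space. In the Python sketch below the point A, the direction vectors and the function names are assumptions chosen for illustration. Membership in the plane is tested through a 3 × 3 determinant, which is valid only for this special case n = 3, r = 2 with independent direction vectors.

```python
def plane_point(p, directions, taus):
    """Equations (2): x_j = q_1j*tau_1 + ... + q_rj*tau_r + p_j."""
    return [
        p_j + sum(tau * q[j] for tau, q in zip(taus, directions))
        for j, p_j in enumerate(p)
    ]


def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w."""
    return (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )


p = [1, 0, 2]                    # coordinates of the point A
q1, q2 = [1, 1, 0], [0, 1, 1]    # independent vectors spanning L_2


def in_plane(point):
    """M lies in the plane iff the vector AM depends linearly on q1, q2."""
    am = [m - a for m, a in zip(point, p)]
    return det3(q1, q2, am) == 0


M = plane_point(p, [q1, q2], [2, -3])
print(M, in_plane(M), in_plane([0, 0, 0]))

# Replacing p by zeros gives a point of the direction subspace.
print(plane_point([0, 0, 0], [q1, q2], [2, -3]))
```

Setting all parameters to zero returns the point A itself, in agreement with the fact that A belongs to the plane.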

7. Example. The space studied in solid geometry is a
three-dimensional affine space, in which the one-dimensional and
two-dimensional planes coincide respectively with the straight lines and
the planes of elementary geometry. This can readily be proved in
a variety of ways, for instance, by taking advantage of the results
of the preceding subsection and the parametric equations of a
straight line and a plane as given in elementary analytic
geometry.

8. Important remark. Metric concepts, such as distances between
points, lengths of lines, areas and volumes of figures, angles and
perpendicularity, are defined in the space studied in elementary
geometry, but are not defined in affine space. In affine space, one
investigates only those geometric properties of figures that do not
depend on metric notions. Nevertheless, such investigations are
substantive and permit solving many problems.

9. Before proceeding to the study of planes (and also other
figures) in affine space, we give in the next few sections the
necessary basic algebraic apparatus: systems of first-degree
equations.

§  4.  Systems  of  first-degree  equations 

1.  Let  us  consider  the  following  system  of  equations: 

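\[ \begin{aligned}
a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n &= b_1, \\
a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n &= b_2, \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n &= b_m
\end{aligned} \tag{1} \]

The letters $a_{11}, \dots, a_{mn}$, $b_1, \dots, b_m$ denote given numbers;
$x_1, \dots, x_n$ stand for unknowns.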

The numbers $a_{ij}$ are called the coefficients of the system (1) and
form an $m \times n$ matrix

$$
A = \begin{Vmatrix} a_{11} & \ldots & a_{1n} \\ \ldots & \ldots & \ldots \\ a_{m1} & \ldots & a_{mn} \end{Vmatrix},
$$

which is called the basic matrix of system (1). Henceforth we will
assume that the matrix $A$ is nonzero, that is, that there are
coefficients in the system (1) that differ from zero.

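The numbers $b_1, \ldots, b_m$ are called the constant terms of the
equations.

The matrix

$$
B = \begin{Vmatrix} a_{11} & \ldots & a_{1n} & b_1 \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & \ldots & a_{mn} & b_m \end{Vmatrix}
$$

is called the augmented matrix of the system (1).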

Any ordered $n$-tuple of numbers $x_1, \ldots, x_n$ whose substitution
in place of the unknowns makes all the equations of the system
arithmetic identities is called a solution of the system. A system
is said to be consistent if it has at least one solution.

2.  Kronecker-Capelli  theorem.  For  system  (1)  to  be  consistent, 
it  is  necessary  and  sufficient  that  the  rank  of  its  augmented 
matrix  be  equal  to  the  rank  of  the  basic  matrix: 


$$\operatorname{rank} B = \operatorname{rank} A \tag{2}$$




Proof. (1) Necessity. Since $A$ is a submatrix of $B$, it is clear that

$$\operatorname{rank} B \geq \operatorname{rank} A \tag{3}$$

Denote by $a_1, \ldots, a_n$ the columns of matrix $A$ and by $b$ the
column of constant terms, and then regard all these columns as
vectors in the coordinate space $K_m$. Let system (1) have the
solution $x_1, \ldots, x_n$. This solution converts the equations to a system of
numerical identities, which may be written down as a single vector
equation:

$$x_1 a_1 + x_2 a_2 + \ldots + x_n a_n = b \tag{4}$$

From (4) it follows that the vectors $a_1, \ldots, a_n, b$ are linearly
expressible in terms of the vectors $a_1, \ldots, a_n$. By Lemma 3 of
Section 6, Chapter I, and by virtue of the definition of the rank of a
matrix we have

$$\operatorname{rank} B \leq \operatorname{rank} A \tag{5}$$

From (3) and (5) follows (2).

(2) Sufficiency. Let (2) hold true. Matrix $A$ is by hypothesis
nonzero; it therefore has a basis minor of order $r = \operatorname{rank} A > 0$.

For the sake of definiteness, let us assume that the first $r$
columns $a_1, \ldots, a_r$ are basis columns. Consider the system of vectors
$a_1, \ldots, a_r, b$. This system is linearly dependent, for otherwise
$\operatorname{rank} B = r + 1 > r$. Therefore, the vector $b$ can be expressed in
terms of the linearly independent vectors $a_1, \ldots, a_r$ (see Chapter I,
Section 6, Lemma 2):

$$b = \lambda_1 a_1 + \ldots + \lambda_r a_r \tag{6}$$

Set

$$x_1 = \lambda_1, \ \ldots, \ x_r = \lambda_r, \quad x_{r+1} = \ldots = x_n = 0 \tag{7}$$

Writing system (1) in the vector form (4) and substituting into
it the quantities (7), we get the identity (6). Thus, system (1) is
consistent and therefore has at least one solution (7). The proof
is complete.
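The criterion of the theorem is easy to check numerically. The following is a minimal sketch, assuming exact rational arithmetic; the helper names `rank` and `is_consistent` are our own and do not come from the text. The rank is found by Gaussian elimination, not by searching for a basis minor.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_consistent(A, b):
    """Kronecker-Capelli test: rank of the augmented matrix B equals rank of A."""
    B = [list(row) + [bi] for row, bi in zip(A, b)]
    return rank(B) == rank(A)

A = [[1, 2], [2, 4]]
print(is_consistent(A, [3, 6]))  # second equation is twice the first
print(is_consistent(A, [3, 7]))  # contradictory right members
```

In the first call the column of constant terms is expressible in terms of the columns of $A$, in the second it is not.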


3.  We  note  the  particular  case  where  the  number  of  equations 
is  equal  to  the  number  of  unknowns  and  matrix  A  (square  in  this 
case)  is  nonsingular,  that  is, 


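$$
D = \det A = \begin{vmatrix} a_{11} & \ldots & a_{1n} \\ \ldots & \ldots & \ldots \\ a_{n1} & \ldots & a_{nn} \end{vmatrix} \neq 0
$$

Then system (1) has a unique solution which may be found by
Cramer's rule:

$$
x_1 = \frac{D_1(b)}{D}, \quad x_2 = \frac{D_2(b)}{D}, \quad \ldots, \quad x_n = \frac{D_n(b)}{D} \tag{8}
$$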



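Here $D_j(b)$ denotes the determinant obtained from $D$ by replacing
the $j$th column by the column of constant terms, that is,

$$
D_j(b) = \begin{vmatrix} a_{11} & \ldots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \ldots & a_{1n} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ a_{n1} & \ldots & a_{n\,j-1} & b_n & a_{n\,j+1} & \ldots & a_{nn} \end{vmatrix} \tag{9}
$$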

Remark 1. If $x$ denotes the vector $(x_1, \ldots, x_n)$ written as a
column, then system (1) can be represented as the matrix equation

$$Ax = b \tag{1a}$$

and Cramer's formulas (8) as the matrix equation

$$x = A^{-1}b \tag{8a}$$

The transition from (1a) to (8a) is attained by multiplying both
members on the left by the matrix $A^{-1}$.

Remark  2.  Formulas  (8)  are  not  convenient  for  practical  solving 
of  systems  with  large  numbers  of  equations  and  unknowns  because 
of  the  difficulties  in  computing  the  determinants  D  and  Dj(b).  For 
this reason, a variety of other methods have been devised for
solving  such  systems.  They  are  given  in  books  on  computational 
mathematics. 
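For small systems, formulas (8) and (9) can nevertheless be applied directly. The sketch below is only an illustration under our own assumptions: the names `det` and `cramer` are hypothetical, and the determinants are computed by exact elimination over the rationals.

```python
from fractions import Fraction

def det(rows):
    """Determinant of a square matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column below the diagonal: det = 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign  # a row interchange changes the sign
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return sign * result

def cramer(A, b):
    """Solve Ax = b by formulas (8); applicable only when det A != 0."""
    D = det(A)
    if D == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    solution = []
    for j in range(len(A)):
        # D_j(b): replace the j-th column of A by the column b, as in (9)
        Aj = [list(row[:j]) + [b[i]] + list(row[j + 1:]) for i, row in enumerate(A)]
        solution.append(det(Aj) / D)
    return solution

# 2*x1 + x2 = 3,  x1 + 3*x2 = 5
print(cramer([[2, 1], [1, 3]], [3, 5]))
```

Each call to `det` costs on the order of $n^3$ operations, so the whole procedure costs on the order of $n^4$; this is one way to see why other methods are preferred for large systems.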


4.  Let  us  return  to  system  (1)  for  arbitrary  m  and  n.  We  assume 
that the conditions of consistency (2) hold true. Our aim will be
to  find  all  the  solutions  of  the  system.  The  number  r,  which  is 
equal  to  the  rank  of  matrices  A  and  B  will  be  called  the  rank  of 
the  system  (1).  For  the  sake  of  definiteness,  we  will  assume  that 
the  basis  minor  of  matrix  A  occupies  the  upper  left  corner  (this 
can  always  be  achieved  by  renumbering  the  unknowns  and  the 
positions  of  the  equations).  We  denote  this  minor  by  D : 


$$
D = \begin{vmatrix} a_{11} & \ldots & a_{1r} \\ \ldots & \ldots & \ldots \\ a_{r1} & \ldots & a_{rr} \end{vmatrix} \neq 0
$$


$D$ is a basis minor for the matrix $B$ as well, and so the rows of
$B$ having numbers $r+1, \ldots, m$ are linear combinations of the
first $r$ rows of the matrix (see Chapter I, Section 7). This means
that equations having numbers $r+1, \ldots, m$ are linear combinations
of the first $r$ equations so that the system (1) is equivalent
to the system

$$
\begin{aligned}
a_{11}x_1 + \ldots + a_{1n}x_n &= b_1,\\
\ldots\qquad\qquad &\\
a_{r1}x_1 + \ldots + a_{rn}x_n &= b_r
\end{aligned}
\tag{10}
$$




On  the  left  we  leave  only  those  terms  whose  coefficients  form 
the  basis  minor  D,  and  transpose  all  other  terms  to  the  right: 


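$$
\begin{aligned}
a_{11}x_1 + \ldots + a_{1r}x_r &= b_1 - a_{1\,r+1}x_{r+1} - \ldots - a_{1n}x_n,\\
\ldots\qquad\qquad &\\
a_{r1}x_1 + \ldots + a_{rr}x_r &= b_r - a_{r\,r+1}x_{r+1} - \ldots - a_{rn}x_n
\end{aligned}
\tag{11}
$$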


We will say that the unknowns $x_{r+1}, \ldots, x_n$ are free, since any
numerical values can be assigned to them. Then the unknowns
$x_1, \ldots, x_r$ are unambiguously determined from system (11) by
Cramer's formulas:

$$
x_j = \frac{D_j(b - x_{r+1}a_{r+1} - \ldots - x_n a_n)}{D} \tag{12}
$$


As before, the $a_j$ here denote the columns of the basic matrix $A$ of
system (10) and $b$ denotes the column of constant terms of the
system (10). The symbol $D_j$ is determined by formula (9) in which
$n$ is replaced by $r$ and the vector $b$ is replaced by the vector
$b - x_{r+1}a_{r+1} - \ldots - x_n a_n$.

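Using the properties of determinants, we expand the numerator
of (12) to obtain

$$
x_j = \frac{D_j(b)}{D} - \frac{D_j(a_{r+1})}{D}\,x_{r+1} - \ldots - \frac{D_j(a_n)}{D}\,x_n \quad (j = 1, \ldots, r) \tag{13}
$$

Let us introduce the following notation:

$$
p_j = \frac{D_j(b)}{D}, \quad q_{kj} = \frac{D_j(-a_k)}{D}, \quad j = 1, \ldots, r; \ k = r+1, \ldots, n \tag{14}
$$

Then from (13) we have

$$
\begin{aligned}
x_1 &= p_1 + q_{r+1\,1}x_{r+1} + \ldots + q_{n1}x_n,\\
&\ldots\\
x_r &= p_r + q_{r+1\,r}x_{r+1} + \ldots + q_{nr}x_n
\end{aligned}
\tag{15}
$$

We adjoin another $n - r$ obvious equations:

$$
x_{r+1} = x_{r+1}, \quad \ldots, \quad x_n = x_n \tag{16}
$$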




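Now substitute a single vector equation for all the equations of
(15) and (16):

$$
\begin{Vmatrix} x_1 \\ \vdots \\ x_r \\ x_{r+1} \\ x_{r+2} \\ \vdots \\ x_n \end{Vmatrix}
=
\begin{Vmatrix} p_1 \\ \vdots \\ p_r \\ 0 \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
+ x_{r+1}
\begin{Vmatrix} q_{r+1\,1} \\ \vdots \\ q_{r+1\,r} \\ 1 \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
+ \ldots + x_n
\begin{Vmatrix} q_{n1} \\ \vdots \\ q_{nr} \\ 0 \\ 0 \\ \vdots \\ 1 \end{Vmatrix}
\tag{17}
$$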


Formula (17) yields a general solution to the system (1) since it
expresses all the unknowns $x_1, \ldots, x_r, x_{r+1}, \ldots, x_n$ in terms of
the free unknowns $x_{r+1}, \ldots, x_n$ to which we can assign arbitrary
numerical values. We will show that all solutions of system (1)
are then exhausted. Indeed, if $x_1, \ldots, x_r, x_{r+1}, \ldots, x_n$ is any given
solution of (1), then $x_{r+1}, \ldots, x_n$ have definite numerical values.
Substituting them into (1) and repeating the previous manipulations,
we get equation (17).
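The column $p$ and the columns multiplying the free unknowns in (17) can be computed mechanically. The sketch below does so by Gauss-Jordan elimination rather than by the determinant formulas (14); this is a substitution of technique on our part, chosen because it does not require the basis minor to sit in the upper left corner. The function name `general_solution` is hypothetical.

```python
from fractions import Fraction

def general_solution(A, b):
    """Return (p, qs) such that every solution is p + sum(t_k * qs[k]),
    or None if the system is inconsistent."""
    n = len(A[0])
    m = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots = []  # column numbers of the non-free (basic) unknowns
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [v / m[r][col] for v in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [a - factor * c for a, c in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    if any(row[n] != 0 for row in m[r:]):
        return None  # rank B > rank A
    free = [c for c in range(n) if c not in pivots]
    p = [Fraction(0)] * n
    for i, c in enumerate(pivots):
        p[c] = m[i][n]
    qs = []
    for k in free:
        q = [Fraction(0)] * n
        q[k] = Fraction(1)          # the "obvious" equation x_k = x_k
        for i, c in enumerate(pivots):
            q[c] = -m[i][k]
        qs.append(q)
    return p, qs

# x1 + x2 + x3 = 6,  x1 + 2*x2 + 3*x3 = 14  (n = 3, r = 2, one free unknown)
print(general_solution([[1, 1, 1], [1, 2, 3]], [6, 14]))
```

Here the third unknown is free, and the single returned vector in `qs` has a 1 in the third position, in agreement with the pattern of zeros and ones in (17).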


5. Denote by $x$ the column in the left member of (17), and by $p$,
$q_{r+1}, \ldots, q_n$ the columns in the right member of that equation in
the order of their arrangement. Then (17) takes the form

$$x = p + x_{r+1}q_{r+1} + \ldots + x_n q_n \tag{18}$$

Equations (17) and (18) are to be understood as equations
between the vectors of the coordinate space $K_n$.


6.  Corollary.  If  system  (1)  is  consistent  and  its  rank  r  is  less 
than  the  number  of  unknowns  n,  then  the  system  has  an  infinitude 
of  solutions. 

Remark. Generally speaking, the choice of free unknowns may
be accomplished in different ways. However, not just any collection
of $n - r$ unknowns can be taken as free unknowns. It is
required that the coefficients of the remaining $r$ unknowns in
system (1) form a basis minor of the matrix $A$.


§  5.  Homogeneous  systems 


1. The system of equations

$$
\begin{aligned}
a_{11}x_1 + \ldots + a_{1n}x_n &= 0,\\
\ldots\qquad\qquad &\\
a_{m1}x_1 + \ldots + a_{mn}x_n &= 0
\end{aligned}
\tag{1}
$$

is said to be homogeneous; here, the right members of all equations
are equal to zero:

$$b_1 = \ldots = b_m = 0 \tag{2}$$

2. A homogeneous system is always consistent. On the one
hand, this follows from the Kronecker-Capelli theorem:
$\operatorname{rank} B = \operatorname{rank} A$, since the matrix $B$ is obtained from $A$ by adjoining
a column of zeros.

On the other hand, it is immediately apparent that system (1)
has a zero solution:

$$x_1 = \ldots = x_n = 0$$

The  zero  solution  of  a  homogeneous  system  is  said  to  be  a 
trivial  solution.  All  other  solutions  are  termed  nontrivial. 

3.  As  before,  we  will  consider  the  solutions  of  system  (1)  as 
vectors in the coordinate space $K_n$.

Theorem  1.  The  solution  set  of  a  homogeneous  system  forms  in 
the  space  Kn  a  subspace  of  dimension  n  —  r,  where  r  is  the  rank 
of  the  system. 

Proof. Due to condition (2), the numbers $p_j$ defined by formulas (14)
of Section 4 vanish:

$$p_j = \frac{D_j(b)}{D} = 0 \quad (j = 1, \ldots, r)$$

Therefore, in the case at hand, $p = 0$ and formula (18) of Section 4
expresses any solution $x$ as a linear combination of the
vectors $q_{r+1}, \ldots, q_n$. Conversely, any linear combination of vectors
$q_{r+1}, \ldots, q_n$ yields a solution of the homogeneous system (1). In
other words, the set $X$ of all solutions of such a system is a linear
hull of vectors $q_{r+1}, \ldots, q_n$ in $K_n$. Hence, $X$ is a linear subspace
of $K_n$.

We now make sure that the vectors $q_{r+1}, \ldots, q_n$ are linearly
independent. With this purpose in mind, let us consider the matrix
$T$ made up of the coordinates of the vectors $q_{r+1}, \ldots, q_n$:

$$
T = \begin{Vmatrix} q_{r+1\,1} & \ldots & q_{n1} \\ \ldots & \ldots & \ldots \\ q_{r+1\,r} & \ldots & q_{nr} \\ 1 & \ldots & 0 \\ \ldots & \ldots & \ldots \\ 0 & \ldots & 1 \end{Vmatrix}
$$

The lower minor of maximum order of the matrix $T$ is its basis
minor (it is equal to unity, that is to say, it is different from zero
and does not have any bordering minors). Hence, the columns of $T$
are linearly independent, which means the vectors $q_{r+1}, \ldots, q_n$
are linearly independent as well.



From the foregoing it follows that the vectors $q_{r+1}, \ldots, q_n$
constitute a basis in $X$. But the number of vectors is equal to $n - r$,
hence $X$ has dimension $n - r$, and the proof of the theorem is
complete.

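4. Let there be given a linearly independent system of solutions,
$n - r$ in all:

$$
c_1 = \begin{Vmatrix} c_{11} \\ \vdots \\ c_{1n} \end{Vmatrix}, \quad \ldots, \quad c_{n-r} = \begin{Vmatrix} c_{n-r\,1} \\ \vdots \\ c_{n-r\,n} \end{Vmatrix} \tag{3}
$$

Then any solution $x$ of system (1) can be represented in the form
of a linear combination of the given solutions (3):

$$x = \tau_1 c_1 + \ldots + \tau_{n-r}c_{n-r} \tag{4}$$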

Conversely,  any  linear  combination  of  the  form  (4)  yields  a  solu¬ 
tion. 

Both assertions follow immediately from the preceding theorem.
Namely, by this theorem the subspace $X$ of solutions of the
system (1) has dimension $n - r$. Hence, the solutions $c_1, \ldots, c_{n-r}$
constitute a basis in that subspace.

Definition. Any linearly independent set of $n - r$ solutions is
said to be fundamental with respect to the system of equations (1).

Conclusion. To solve a homogeneous system of equations (1),
it suffices to find some fundamental system of its solutions
$c_1, \ldots, c_{n-r}$. Then all solutions of (1) are given by (4), in which
each of the parameters $\tau_1, \ldots, \tau_{n-r}$ independently runs through
all possible numerical values.

Remark. One of the fundamental systems of solutions is made
up of the columns of matrix $T$. This solution set is given by
formulas (14) of Section 4.

5. Example. Let us consider the system of equations

$$
\begin{aligned}
x_1 + x_2 + x_3 - x_4 &= 0,\\
x_2 - x_3 + x_4 &= 0
\end{aligned}
\tag{5}
$$

Here $n = 4$, $r = 2$, and so the solution space has dimension
$n - r = 2$. Hence all we need to do is find some two independent
solutions, say

$$
\begin{aligned}
x_1 &= 0, & x_2 &= 0, & x_3 &= 1, & x_4 &= 1;\\
x_1 &= 2, & x_2 &= -1, & x_3 &= -1, & x_4 &= 0
\end{aligned}
$$




whence  we  get  the  general  solution  to  system  (5): 


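$$
\begin{Vmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{Vmatrix}
= \tau_1 \begin{Vmatrix} 0 \\ 0 \\ 1 \\ 1 \end{Vmatrix}
+ \tau_2 \begin{Vmatrix} 2 \\ -1 \\ -1 \\ 0 \end{Vmatrix}
$$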
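The example is easily verified by direct substitution. In the sketch below (the helper names `left_members` and `combine` are ours) both chosen solutions, and an arbitrary linear combination of them, are substituted into the left members of system (5).

```python
# System (5): x1 + x2 + x3 - x4 = 0,  x2 - x3 + x4 = 0
A = [[1, 1, 1, -1],
     [0, 1, -1, 1]]
c1 = [0, 0, 1, 1]
c2 = [2, -1, -1, 0]

def left_members(A, x):
    """Values of the left members of the system at the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def combine(t1, t2):
    """The general solution t1*c1 + t2*c2."""
    return [t1 * u + t2 * v for u, v in zip(c1, c2)]

print(left_members(A, c1), left_members(A, c2))  # both are zero vectors
print(left_members(A, combine(3, -5)))           # so is any combination
```

The two solutions are independent because their first components are $0$ and $2$ while their fourth components are $1$ and $0$, so neither is a multiple of the other.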

6. Important particular cases. (1) A homogeneous system of $n$
equations in $n$ unknowns:

$$
\begin{aligned}
a_{11}x_1 + \ldots + a_{1n}x_n &= 0,\\
\ldots\qquad\qquad &\\
a_{n1}x_1 + \ldots + a_{nn}x_n &= 0
\end{aligned}
\tag{6}
$$

Relative to system (6) we note only one theorem, which is
frequently used in the applications of linear algebra.

Theorem 2. A system of type (6) has a nontrivial solution if
and only if its determinant is zero:

$$D = \det \|a_{ij}\| = 0$$

Indeed, in this case and only in this case is $r = \operatorname{rank} A < n$
and the dimensionality of the solution space positive: $n - r > 0$.

(2) A homogeneous system of $n - 1$ independent equations in
$n$ unknowns:

$$
\begin{aligned}
a_{11}x_1 + \ldots + a_{1n}x_n &= 0,\\
\ldots\qquad\qquad &\\
a_{n-1\,1}x_1 + \ldots + a_{n-1\,n}x_n &= 0
\end{aligned}
\tag{7}
$$

The independence condition of the equations means that the
(rectangular) matrix $A = \|a_{ij}\|$ of system (7) has rank $r = n - 1$. In
this case the solution space is one-dimensional ($n - r = 1$), and
to obtain the general solution of system (7) it suffices to find one
nontrivial solution. This is done as follows.

Form an auxiliary square matrix $\tilde{A}$ of order $n$, which is obtained
from matrix $A$ by adjoining a new row at the top:

$$
\tilde{A} = \begin{Vmatrix} a_{01} & \ldots & a_{0n} \\ a_{11} & \ldots & a_{1n} \\ \ldots & \ldots & \ldots \\ a_{n-1\,1} & \ldots & a_{n-1\,n} \end{Vmatrix}
$$

where $a_{01}, \ldots, a_{0n}$ are arbitrary numbers. Denote by $A_{0j}$ the
cofactors of the elements $a_{0j}$ in matrix $\tilde{A}$. Then the quantities

$$x_1 = A_{01}, \quad x_2 = A_{02}, \quad \ldots, \quad x_n = A_{0n} \tag{8}$$

form a solution of system (7). Indeed, substituting the quantities (8) into
the $i$th equation, we obtain the sum of the products of the elements
of one row of matrix $\tilde{A}$ by the cofactors of the elements of
another row, which sum is known to be equal to zero:

$$a_{i1}A_{01} + \ldots + a_{in}A_{0n} = 0$$

Thus, the numbers (8) satisfy the system (7).

Denote by $M_j$ the minor of matrix $A$ of order $n - 1$ obtained by
striking out the $j$th column:

$$
M_j = \begin{vmatrix} a_{11} & \ldots & a_{1\,j-1} & a_{1\,j+1} & \ldots & a_{1n} \\ a_{21} & \ldots & a_{2\,j-1} & a_{2\,j+1} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ a_{n-1\,1} & \ldots & a_{n-1\,j-1} & a_{n-1\,j+1} & \ldots & a_{n-1\,n} \end{vmatrix}
$$

Then $A_{0j} = (-1)^{j+1}M_j$. Among the minors $M_j$ there is at least one
that is nonzero (the basis minor of matrix $A$). For this reason,
solution (8) is nontrivial. The general solution of system (7) may
be written thus:

$$x_j = (-1)^{j+1}M_j\tau$$

where $\tau$ is an arbitrary number. In other words, the solution of
system (7) is proportional to the minors of maximum order of the
matrix $A$ taken with alternating signs. This is sometimes written
as follows:

$$x_1 : x_2 : x_3 : \ldots = M_1 : (-M_2) : M_3 : \ldots$$
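The rule of signed minors lends itself to a short computation. The following sketch is an illustration under our own naming (`det`, `solution_by_minors`); the determinant is expanded along the first row, which is adequate only for small matrices.

```python
def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def solution_by_minors(A):
    """A has n-1 independent rows and n columns.
    Returns the nontrivial solution x_j = (-1)^(j+1) * M_j (j counted from 1)."""
    n = len(A[0])
    x = []
    for j in range(n):  # j is 0-based here, so the sign is (-1)^j
        minor = det([row[:j] + row[j + 1:] for row in A])
        x.append((-1) ** j * minor)
    return x

# x1 + 2*x2 + 3*x3 = 0,  4*x1 + 5*x2 + 6*x3 = 0
A = [[1, 2, 3],
     [4, 5, 6]]
x = solution_by_minors(A)
print(x)
print([sum(a * v for a, v in zip(row, x)) for row in A])  # substitution check
```

For $n = 3$ the result coincides with the familiar vector product of the two rows.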

7. Earlier, in Subsection 3, we demonstrated that a homogeneous
system of first-degree equations of rank $r$ determines in an
$n$-dimensional linear space $L_n$ a subspace of dimension $n - r$. We will
now show that the converse is also true, namely that the following
theorem is valid.

Theorem 3. Any subspace of dimension $k$ in a space $L_n$ with a
given basis is a solution subspace of some homogeneous system
of linear equations of rank $n - k$.

Proof. Let there be given in $L_n$ a basis $e_1, \ldots, e_n$ and a
subspace $L_k$. In this subspace take $k$ independent vectors denoted by
$e'_{n-k+1}, \ldots, e'_n$. Using the lemma of Subsection 7, Section 14,
Chapter I, complete them to form a basis in $L_n$:

$$e'_1, \ldots, e'_{n-k}, e'_{n-k+1}, \ldots, e'_n \tag{9}$$

The subspace $L_k$ is a linear hull of the vectors $e'_{n-k+1}, \ldots, e'_n$.
Therefore the vector $x$ in $L_n$ lies in $L_k$ if and only if in the basis (9)
its coordinates with numbers $1, \ldots, n - k$ are zero:

$$x'_i = 0 \quad (i = 1, \ldots, n - k) \tag{10}$$

Formulas (10) constitute a system of equations that determine $L_k$
in the basis (9). Now let us pass to the original basis $e_1, \ldots, e_n$.




To do this we take advantage of the formulas for transforming
the coordinates (see Chapter II, Section 5):

$$x'_i = \sum_{j=1}^{n} Q_{ij}x_j,$$

whence we obtain the desired system of equations (11), which is
equivalent to the system (10):

$$\sum_{j=1}^{n} Q_{ij}x_j = 0, \quad i = 1, \ldots, n - k \tag{11}$$

Since the $n \times n$ matrix $Q = \|Q_{ij}\|$ is nonsingular, all its rows
are linearly independent. Hence, the rank of system (11) is equal
to the number of its equations: $r = n - k$. The proof of Theorem 3
is complete.

8.  As  will  be  seen  from  the  foregoing  proof,  the  specific  notation 
of  a  system  of  equations  defining  Lk  depends  on  the  choice  of 
basis. 

Also,  a  given  subspace  may  be  specified  in  a  given  basis  by 
distinct  homogeneous  systems  of  equations.  This  is  clear  since  for 
system  (11)  there  are  an  infinity  of  other  equivalent  systems.  We 
now  show  how  these  systems  can  be  constructed.  Let 

$$
H = \|h_{li}\| = \begin{Vmatrix} h_{11} & \ldots & h_{1\,n-k} \\ \ldots & \ldots & \ldots \\ h_{n-k\,1} & \ldots & h_{n-k\,n-k} \end{Vmatrix}
$$

be any nonsingular square matrix of order $n - k$. Fix $l$, multiply
equations (11) respectively by the numbers $h_{li}$ ($i = 1, \ldots, n - k$),
and add them. Then write down the resulting relations taking
$l = 1, \ldots, n - k$ to get the homogeneous system

$$\sum_{i=1}^{n-k} h_{li} \sum_{j=1}^{n} Q_{ij}x_j = 0 \quad (l = 1, \ldots, n - k) \tag{12}$$

By introducing the numbers

$$R_{lj} = \sum_{i=1}^{n-k} h_{li}Q_{ij} \quad (l = 1, \ldots, n - k; \ j = 1, \ldots, n) \tag{13}$$

we can write system (12) more simply:

$$\sum_{j=1}^{n} R_{lj}x_j = 0 \quad (l = 1, \ldots, n - k) \tag{14}$$

System (14) is clearly a corollary to system (11). We will now
show that in turn system (11) is a consequence of system (14).



For each fixed $j$ ($1 \leq j \leq n$), formula (13) may be regarded as a
system of equations with unknowns $Q_{ij}$ and right members $R_{lj}$.
Solving this system by Cramer's rule for each $j$, we find that

$$Q_{ij} = \sum_{\alpha=1}^{n-k} H_{i\alpha}R_{\alpha j}, \tag{15}$$

where the matrix $\|H_{i\alpha}\| = H^{-1}$ is the inverse of $H = \|h_{li}\|$.

Formula (15) shows that the coefficients of system (11) are
expressed in terms of the coefficients of system (14) with the aid of
the matrix $\|H_{i\alpha}\|$ just as the coefficients of (14) were expressed in
terms of (11) with the aid of the matrix $\|h_{li}\|$. Thus, the systems
(11) and (14) are equivalent, and from the given system (11) we
can obtain an infinity of equivalent systems of the form (12) for
the reason that there are an infinity of ways of choosing the
nonsingular matrix $H$.

Remark. If the rectangular matrices of systems of equations
(11) and (14) are denoted by $Q$ and $R$, respectively, two systems
of equations (13) and (15) may be replaced by the two matrix
equations

$$R = HQ, \quad Q = H^{-1}R$$

whence it follows that $\operatorname{rank} R = \operatorname{rank} Q$. In other words, all
systems of the form (14) that we construct via the given system
(11) have the same rank, $n - k$.

Example. The system

$$
\begin{aligned}
x_1 + x_2 - x_3 - x_4 &= 0,\\
x_1 - x_2 + x_3 + x_4 &= 0
\end{aligned}
$$

defines in four-dimensional linear space a certain two-dimensional
subspace $L_2$. Taking

$$H = \frac{1}{2}\begin{Vmatrix} 1 & 1 \\ 1 & -1 \end{Vmatrix}$$

we obtain for (14) the system

$$x_1 = 0, \quad x_2 - x_3 - x_4 = 0$$

which defines the same subspace $L_2$ as the given system (in other
words, we replace the given equations by their half-sum and
half-difference).
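The passage from the given system to the new one is just the matrix product $R = HQ$ of the Remark above. A minimal sketch of the computation follows; the helper `matmul` is our own, and fractions are used so that the entries $\tfrac{1}{2}$ are exact.

```python
from fractions import Fraction

half = Fraction(1, 2)
Q = [[1, 1, -1, -1],      # x1 + x2 - x3 - x4 = 0
     [1, -1, 1, 1]]       # x1 - x2 + x3 + x4 = 0
H = [[half, half],        # half-sum of the two equations
     [half, -half]]       # half-difference of the two equations

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

R = matmul(H, Q)
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(R)       # rows of coefficients of the new system
print(det_H)   # nonzero, so H is nonsingular
```

The rows of `R` are the coefficients of $x_1 = 0$ and $x_2 - x_3 - x_4 = 0$.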

9.  We  conclude  this  section  with  proof  that  a  linear  nonsingular 
transformation  of  variables  retains  the  rank  of  the  system  of  equa¬ 
tions  (1).  Write  (1)  in  the  form  of  a  matrix  equation: 

$$AX = 0 \tag{1a}$$




where $A = \|a_{ij}\|$ is an $m \times n$ matrix of coefficients, $X$ is the
column matrix of the unknowns $x_1, \ldots, x_n$, and the zero in the right
member denotes the zero $m \times 1$ matrix. Let there be given the
change of variables $X' = QX$ (see formulas (IIIa) and (III),
Section 5, Chapter II; $Q$ is a nonsingular $n \times n$ matrix). Then

$$X = P^*X' \tag{16}$$

where $P = \|P_{ij}\| = (Q^{-1})^*$. Substituting (16) into (1a), we get the
following matrix notation for the system of equations under
consideration in the new variables $x'_1, \ldots, x'_n$:

$$(AP^*)X' = 0$$

or, expanded,

$$\sum_{j=1}^{n}\Bigl(\sum_{k=1}^{n} a_{ik}P_{jk}\Bigr)x'_j = 0, \quad i = 1, \ldots, m$$

The matrix $P$ is nonsingular so that

$$\operatorname{rank} AP^* = \operatorname{rank} A$$


according  to  Subsection  3,  Section  4,  Chapter  II.  This  completes 
the  proof  of  the  assertion  stated  at  the  beginning  of  this  subsec¬ 
tion. 
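The invariance of the rank can be observed on a small numerical example. In the sketch below the matrix `S` is a hypothetical stand-in for $P^*$ (any nonsingular matrix will do), and the helpers `rank` and `matmul` are our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2, 3],
     [2, 4, 6]]    # rank 1: the second row is twice the first
S = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]    # nonsingular; plays the role of P* (an assumed example)

print(rank(A), rank(S), rank(matmul(A, S)))
```

The first and last printed ranks agree, as the assertion of this subsection requires.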


§  6.  Nonhomogeneous  systems 

1. Given a nonhomogeneous system

$$\sum_{j=1}^{n} a_{ij}x_j = b_i \tag{1}$$

where $i = 1, \ldots, m$ and among the $b_i$ there are nonzero numbers.
Assume that the system is consistent, that is, that
$\operatorname{rank} A = \operatorname{rank} B = r$. Let $\{x_1^0, \ldots, x_n^0\}$ be a solution of system (1).
Substituting this solution into system (1), we get the identities

$$\sum_{j=1}^{n} a_{ij}x_j^0 = b_i \tag{2}$$

Subtract identities (2) from equations (1) to get system (3),
which is equivalent to system (1):

$$\sum_{j=1}^{n} a_{ij}(x_j - x_j^0) = 0 \tag{3}$$

Put $x_j - x_j^0 = u_j$ and we get the homogeneous system

$$\sum_{j=1}^{n} a_{ij}u_j = 0 \tag{4}$$


Suppose that for the system of equations (4) we know the
fundamental set of solutions

$$
c_1 = \begin{Vmatrix} c_{11} \\ \vdots \\ c_{1n} \end{Vmatrix}, \quad \ldots, \quad c_{n-r} = \begin{Vmatrix} c_{n-r\,1} \\ \vdots \\ c_{n-r\,n} \end{Vmatrix} \tag{5}
$$

Then, by the results of Subsection 4, Section 5, any solution of
(4) can be expressed in the form of a linear combination of
vectors (5), that is

$$u_j = \tau_1 c_{1j} + \ldots + \tau_{n-r}c_{n-r\,j}, \tag{6}$$

where $\tau_1, \ldots, \tau_{n-r}$ are arbitrary scalars. Since $u_j = x_j - x_j^0$,
from (6) we get

$$x_j = x_j^0 + (\tau_1 c_{1j} + \ldots + \tau_{n-r}c_{n-r\,j}), \quad j = 1, \ldots, n \tag{7}$$

Let us call $x_1^0, \ldots, x_n^0$ a particular solution of system (1). The
sum in brackets in (7) is the general solution of system (4).

The  system  (4)  obtained  from  (1)  by  replacing  the  right 
members  by  zeros  is  called  the  homogeneous  system  correspond¬ 
ing  to  system  (1). 

Formula  (7)  shows  that  the  following  theorem  holds  true. 

Theorem 1. The general solution of a nonhomogeneous system
(1)  is  represented  in  the  form  of  a  sum  of  an  arbitrary  particular 
solution  of  that  system  and  the  general  solution  of  the  correspond¬ 
ing  homogeneous  system. 
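Formula (7) can be tried out on a small system. The numbers below are our own illustrative choice: a system of two equations in three unknowns with $r = 2$, so that the fundamental set of the homogeneous system consists of a single vector.

```python
A = [[1, -1, 2],
     [2, 1, 1]]
b = [3, 3]
x0 = [2, -1, 0]    # a particular solution of the nonhomogeneous system
c1 = [-1, 1, 1]    # fundamental set of the homogeneous system (n - r = 1)

def left_members(A, x):
    """Values of the left members of the system at the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def solution(tau):
    """Formula (7): particular solution plus tau times the fundamental vector."""
    return [p + tau * c for p, c in zip(x0, c1)]

print(left_members(A, x0))   # equals b
print(left_members(A, c1))   # equals the zero column
for tau in (-2, 0, 5):
    print(tau, left_members(A, solution(tau)) == b)
```

Every value of the parameter gives a solution of the nonhomogeneous system, in accordance with Theorem 1.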

2. A geometrical interpretation of the set of solutions of a
nonhomogeneous system of linear equations. We consider an
$n$-dimensional affine space $\mathfrak{A}_n$. In it specify an affine system of
coordinates. Then to each solution $x_1, \ldots, x_n$ of (1) we can associate a
point of the space $\mathfrak{A}_n$ with the coordinates $x_1, \ldots, x_n$. The
following theorem is valid.

Theorem 2. All solutions of system (1) form in $\mathfrak{A}_n$ a plane of
dimension $n - r$.

Proof.  All  solutions  of  system  (1)  are  given  by  formula  (7). 
Because  of  the  independence  of  the  vectors  (5),  this  formula  is 
nothing  but  the  parametric  equations  of  a  certain  plane  of  dimen¬ 
sion  n  —  r  (see  Section  3,  Subsection  6).  The  proof  of  Theorem  2 
is  complete. 

Theorem 3. In affine space $\mathfrak{A}_n$ and in any affine coordinates,
any plane $P_m$ may be specified by a system of linear equations of
the form (1) and of rank $r = n - m$.




Proof. Let the plane $P_m$ pass through a point $A$ having
coordinates $x_1^0, \ldots, x_n^0$ in the direction of the subspace $L_m$. Transfer the
origin of the affine system of coordinates to point $A$ preserving
the original basis. We denote the coordinates of the running point
$M$ in the original system by $x_1, \ldots, x_n$ and in the new system by
$\tilde{x}_1, \ldots, \tilde{x}_n$. The latter coincide with the coordinates of the vector
$\overrightarrow{AM} \in L_m$. By Theorem 3, Section 5, the subspace $L_m$ is given by
a certain homogeneous system of linear equations of rank
$r = n - m$:

$$\sum_{j=1}^{n} a_{ij}\tilde{x}_j = 0, \quad i = 1, \ldots, r$$

Taking into account that $\tilde{x}_j = x_j - x_j^0$, we get

$$\sum_{j=1}^{n} a_{ij}(x_j - x_j^0) = 0$$

Putting $b_i = \sum_{j=1}^{n} a_{ij}x_j^0$, we find the system of equations

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \quad i = 1, \ldots, r \tag{8}$$

which is of the same rank $r = n - m$ and defines $P_m$ in the
original coordinates. This completes the proof of Theorem 3.

3.  Corollary.  A  plane  is  given  by  a  homogeneous  system  of 
linear  equations  if  and  only  if  it  passes  through  the  coordinate 
origin. 

4.  Important  special  case.  A  hyperplane  is  specified  by  a  single 
linear  equation: 

$$a_1x_1 + a_2x_2 + \ldots + a_nx_n = b$$

5. Each one of the equations in (8) can be regarded as the
equation of some hyperplane. For this reason, every plane of dimension
$m$ may be regarded as the intersection of a certain number $n - m$
of hyperplanes.

6.  If  a  system  of  linear  equations  is  inconsistent,  then  this  signi¬ 
fies  geometrically  that  there  is  not  a  single  point  belonging  at 
once  to  all  hyperplanes  given  by  the  equations  of  the  system. 

7. It is quite obvious that when one passes to new affine
coordinates the form of the equations (8) changes. Besides, a given
plane $P_m$ in a given system of affine coordinates may be specified
by distinct systems of equations. This is clear because there is an
infinity of other equivalent systems for the system (8). Thus, for
example, we can take any nonsingular square matrix $H = \|h_{ij}\|$,
$i, j = 1, 2, \ldots, r$, and write down the corollaries to the system (8):

$$\sum_{\alpha=1}^{r} h_{i\alpha} \sum_{j=1}^{n} a_{\alpha j}x_j = \sum_{\alpha=1}^{r} h_{i\alpha}b_\alpha, \quad i = 1, \ldots, r \tag{9}$$

Introducing the notation

$$a'_{ij} = \sum_{\alpha=1}^{r} h_{i\alpha}a_{\alpha j}, \quad b'_i = \sum_{\alpha=1}^{r} h_{i\alpha}b_\alpha$$

we can write equations (9) more simply:

$$\sum_{j=1}^{n} a'_{ij}x_j = b'_i, \quad i = 1, \ldots, r \tag{10}$$

The system of equations (10) not only follows from the system (8),
but is equivalent to it (the proof of this statement is similar to
the proof of an analogous assertion in Subsection 8, Section 5).

The possibility of passing from system (8) to other systems of
the form (10) signifies geometrically that $P_m$ may be defined as
the intersection of distinct sets of $n - m$ independent hyperplanes.
The independence of hyperplanes is to be understood in the sense
that the rank of the consistent system (10) of equations of these
hyperplanes is of maximum value, that is, it is equal to the number
of equations ($r = n - m$).

§  7.  Mutual  positions  of  planes 

1. Intersecting planes. Throughout this section, the dimensions
of planes and subspaces will be indicated by subscripts. Let two
planes $P_k$ and $P_l$ in the affine space $\mathfrak{A}_n$ have a common point $A$.
We take this point as the origin of the affine system of coordinates.
When a running point $M$ ranges over the plane $P_k$ (or $P_l$), a
vector $\overrightarrow{AM}$ ranges over the subspace $L_k$ (or $L_l$). Therefore, the
question of the mutual positions of two intersecting planes is naturally
connected with a consideration of the subspaces $L_k$ and $L_l$ in the
vector space $L_n$.

Using the properties of subspaces (Chapter I, Sections 12 to 14),
we can readily establish the following facts.

(1) If planes $P_k$ and $P_l$ intersect, then their intersection is a
certain plane $P_m$ (in Fig. 10 we have $k = l = 2$, $m = 1$).

Remark 1. It may happen that $P_m$ consists of one point
($m = 0$). This is evident from the example of two intersecting
straight lines or a straight line and a plane (Fig. 11). In the
general case, two planes can intersect in a single point provided the sum
of the dimensions of the planes does not exceed the dimension of the
space; for instance, two two-dimensional planes in a four-dimensional
space.

Remark 2. We do not preclude another extreme case where one
of the two planes lies entirely in the other. For instance, if $P_k \subset P_l$,
$k < l$, then $P_m = P_k$ (in Fig. 12, $k = m = 1$, $l = 2$).


Fig.  10  Fig.  11 


(2) If the planes $P_k$ and $P_l$ intersect along the plane $P_m$, then
there exists a unique plane $P_r$ of dimension $r = k + l - m$ which
contains $P_k$ and $P_l$; the two planes $P_k$ and $P_l$ cannot
simultaneously lie in any other plane of smaller dimension. The direction
subspace $L_r$ of the plane $P_r$ is the sum of the direction subspaces
$L_k$ and $L_l$. This sum is a direct sum if and only if $P_k$ and $P_l$
intersect in a single point ($m = 0$, see Fig. 13). In the special case
$k + l - m = n$, the role of plane $P_r$ is played by the entire space
$\mathfrak{A}_n$ (for $r = n = 3$, see Fig. 10).

Fig. 12  Fig. 13

(3) If the intersecting planes $P_k$ and $P_l$ lie in some plane $P_r$,
then the dimension of their intersection $m \geq k + l - r$. In
particular

$$m \geq k + l - n \tag{1}$$

for any two intersecting planes in $\mathfrak{A}_n$.




(4) If the planes $P_k$ and $P_l$ pass through a point $A$ in the
direction of the subspaces $L_k$ and $L_l$ respectively and if $L_k$ lies in $L_l$,
then plane $P_k$ lies in plane $P_l$. And if in that case $k = l$, then $P_k$
coincides with $P_l$ (also, $L_k$ coincides with $L_l$).
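Facts (2) and (3) reduce to a rank computation when the direction subspaces are given by bases. The sketch below assumes that the two planes have a common point and that each list of vectors passed in is linearly independent; the names `rank` and `intersection_dimension` are our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def intersection_dimension(basis_k, basis_l):
    """Dimension m of the intersection of two planes with a common point,
    given bases of their direction subspaces L_k and L_l."""
    k, l = len(basis_k), len(basis_l)
    r = rank(basis_k + basis_l)   # dimension of the sum L_k + L_l
    return k + l - r              # from r = k + l - m

# Two two-dimensional planes in three-dimensional space meet along a line.
print(intersection_dimension([[1, 0, 0], [0, 1, 0]], [[0, 1, 0], [0, 0, 1]]))
# Two two-dimensional planes in four-dimensional space may meet in one point.
print(intersection_dimension([[1, 0, 0, 0], [0, 1, 0, 0]],
                             [[0, 0, 1, 0], [0, 0, 0, 1]]))
```

Since $r$ cannot exceed $n$, the value returned is never less than $k + l - n$, which is inequality (1).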

2. Parallel planes. Now let plane $P_k$ be defined by the point $A$
and the subspace $L_k$, and plane $P_l$ by the point $B$ and the
subspace $L_l$. We assume that $l \geq k$.

Definition. The plane $P_k$ is parallel to the plane $P_l$ if $L_k \subset L_l$.
We will also allow for the statement, in that case, that the plane
$P_l$ is parallel to the plane $P_k$.

Remark 1. According to this definition, the inclusion $P_k \subset P_l$
is a special case of parallelism.

Remark 2. If $P_k$ is parallel to $P_l$ and $k = l$, then $L_k$ coincides
with $L_l$.


Fig.  14 

Remark 3. It is easy to see that for $n = 3$ the special cases
$k = l = 1$, $k = l = 2$, and $k = 1$, $l = 2$ agree with the notion of
parallelism of straight lines and planes in elementary geometry
(Fig. 14).

Suppose  two  planes  P  and  P'  of  the  same  dimensions  are  given 
in  an  arbitrary  affine  system  of  coordinates  by  systems  of  linear 
equations.  Taking  advantage  of  the  definition  of  parallelism,  we 
can  readily  establish  the  following  assertion. 

For  P  and  P'  to  be  parallel  it  is  necessary  and  sufficient  that 
the  corresponding  homogeneous  systems  of  equations  be  equiva¬ 
lent. 

In particular, two hyperplanes are parallel if and only if they
are given, in the same coordinates, by the equations

$$a_1 x_1 + \dots + a_n x_n + b = 0 \tag{2}$$

and

$$a'_1 x_1 + \dots + a'_n x_n + b' = 0 \tag{2'}$$

with proportional coefficients of the variables:

$$\frac{a_1}{a'_1} = \dots = \frac{a_n}{a'_n}$$

For the hyperplanes (2) and (2') to be coincident, it is necessary
and sufficient that all the coefficients of the equations be proportional:

$$\frac{a_1}{a'_1} = \dots = \frac{a_n}{a'_n} = \frac{b}{b'}$$

Theorem 1. Let there be given in the affine space $\mathfrak{A}_n$ a plane $P_k$
and a point $B$. Then there exists a unique plane $P'_k$ of dimension
$k$ that passes through point $B$ parallel to $P_k$. If $B \in P_k$, then $P'_k$
coincides with $P_k$; if the point $B$ lies outside $P_k$, then the planes
$P_k$ and $P'_k$ do not intersect.

The  proof  has  been  so  fully  prepared  by  the  foregoing  material 
that  there  is  no  need  to  give  it  here. 

3.  Skew  planes. 

Definition.  Two  planes  are  said  to  be  skew  if  they  do  not  inter¬ 
sect  and  are  not  parallel. 

In three-dimensional space $\mathfrak{A}_3$, we know that two straight lines
(one-dimensional planes, that is) can be skew, whereas a straight
line and a two-dimensional plane in $\mathfrak{A}_3$ cannot be skew. As the
dimension of a space is increased, the space becomes more "roomy"
and there is more opportunity to construct skew planes of different
dimensions besides the one-dimensional variety. Theorem 2, below,
may be regarded as a general procedure for the construction of
skew planes. Suppose, in the affine space $\mathfrak{A}_n$, we have a plane $P_l$
($l < n$). Let us take an arbitrary plane $P_k$ so that $P_k$ and $P_l$ are
not parallel and intersect; the plane along which they intersect is
denoted by $P_m$. Let $P_r$ be the plane of smallest dimension containing
$P_k$ and $P_l$. We know that $r = k + l - m$.

Theorem 2. If $k + l - m < n$, then any $k$-dimensional plane
parallel to $P_k$ and not lying in $P_r$ is skew to $P_l$.

Corollary. If the integers $k$, $l$, $m$, $n$ satisfy the inequalities

$$0 \leq m < k, \quad 0 \leq m < l, \quad k + l - m < n$$

then there exist in $\mathfrak{A}_n$ skew planes $P_k$ and $P_l$ with direction
subspaces $L_k$ and $L_l$ whose intersection $L_m = L_k \cap L_l$ has
dimension $m$.

Proof. Since $r = k + l - m < n$, plane $P_r$ does not exhaust the
whole of space $\mathfrak{A}_n$. This enables us (with a great deal of arbitrariness)
to take a point $C$ outside of $P_r$. Denote by $P'_k$ a plane of
dimension $k$ passing through $C$ parallel to $P_k$. It is clear that $P'_k$
is not contained in $P_r$ and that by selecting $C$ in different ways we
can obtain any $k$-dimensional plane that satisfies the hypothesis of
the theorem. (See Fig. 15, in which $k = l = 2$, $r = 3$, $n = 4$, and
the three-dimensional planes are depicted in the form of parallelepipeds.)
We will prove that the planes $P_l$ and $P'_k$ are skew planes.

Note that plane $P'_k$ is not parallel to $P_l$, otherwise either $L_k \subset L_l$
or $L_l \subset L_k$, which is contrary to the condition stipulating the
positions of the planes $P_k$ and $P_l$.

We now prove that $P'_k$ and $P_l$ do not intersect. Draw through
point $C$ an auxiliary $r$-dimensional plane $P'_r$ parallel to $P_r$. Then
$P'_k \subset P'_r$ and therefore $P'_k$ cannot intersect $P_l$, for then the point
of their intersection would belong to the parallel planes $P_r$ and $P'_r$,
which by Theorem 1 have no common points.
Hence, $P'_k$ is skew to $P_l$. Theorem 2 is proved.

Suppose in an $n$-dimensional affine space $\mathfrak{A}_n$ we have skew
planes $P_k$ and $P_l$ with direction subspaces $L_k$ and $L_l$, and
$L_k \cap L_l = L_m$, $k + l - m < n$.


Fig.  15 

Theorem 3. There exists a unique plane $P_{r+1}$ of dimension
$r + 1 = (k + l - m) + 1$ containing the planes $P_k$ and $P_l$.

Proof. Choose an arbitrary point $A \in P_k$ and fix an arbitrary
point $B$ in $P_l$. Denote the linear hull of the vector $\overrightarrow{AB}$ by
$L(\overrightarrow{AB})$ (Fig. 16). Suppose there is a plane $P$ containing $P_k$ and $P_l$.
Let $\mathcal{L}$ be its direction subspace. Clearly, $\mathcal{L}$ must contain $L_k$, $L_l$
and $L(\overrightarrow{AB})$ and, hence, also the sum of these subspaces. Denote this
sum by $L_{r+1}$:

$$L_{r+1} = L_k + L_l + L(\overrightarrow{AB}) \subset \mathcal{L}$$

Conversely, if $\mathcal{L}$ is any subspace including $L_{r+1}$, then the plane $P$
that passes through point $A$ in the direction of $\mathcal{L}$ will contain $P_k$
and $P_l$. Indeed, since $A \in P$ and $L_k \subset \mathcal{L}$, it follows that $P_k \subset P$.
Since $A \in P$ and $\overrightarrow{AB} \in \mathcal{L}$, it follows that $B \in P$; since $B \in P$ and
$L_l \subset \mathcal{L}$, then $P_l \subset P$.

We thus obtain, from among all planes $P$, the desired plane
$P_{r+1}$ of minimal dimension $r + 1$ in the unique case where $L_{r+1}$ is
taken for $\mathcal{L}$. Let us compute $r + 1$. To do this, consider
$L' = L_k + L_l$ and denote the dimension of $L'$ by $p$. By Theorem 3,
Section 14, Chapter 1, we have $p = k + l - m$. Below we will
show that $L_{r+1} = L' + L(\overrightarrow{AB})$ is a direct sum; and so the dimension
of $L_{r+1}$ is equal to $p + 1$, that is to say,
$r + 1 = (k + l - m) + 1$.

It is thus necessary to establish that $L_{r+1} = L' \oplus L(\overrightarrow{AB})$. To do
so, it suffices to show that the vector $\overrightarrow{AB}$ does not belong to the
subspace $L'$. Assume the contrary. Let $\overrightarrow{AB} \in L'$. Then by the definition
of a sum of subspaces there exist vectors $x$, $y$ such that

$$x \in L_k, \quad y \in L_l, \quad \overrightarrow{AB} = x + y \tag{3}$$

By the first axiom of an affine space there will be a point $C$ such
that $\overrightarrow{AC} = x$ and $C \in P_k$. By the second axiom of an affine space,

$$x + \overrightarrow{CB} = \overrightarrow{AC} + \overrightarrow{CB} = \overrightarrow{AB} \tag{4}$$

Taking into account (3) and (4), we find that

$$\overrightarrow{BC} = -y \in L_l \tag{5}$$

so that $C \in P_l$. It turns out that the planes $P_k$ and $P_l$ have a common
point $C$, but this is impossible since $P_k$ and $P_l$ are skew
planes. The proof of Theorem 3 is complete.




Fig.  16 

Remark. Fig. 16 is only a partial illustration of Theorem 3. For
example, if the dimensions of $P_k$ and $P_l$ exceed $m$ and are distinct,
$m \geq 1$, $P_{r+1} \neq P \neq \mathfrak{A}_n$, then it is easy to compute that $n \geq 7$. It
is impossible to give a complete drawing of such a situation. In
the sequel we will frequently make use of drawings depicting
figures in low-dimensional spaces ($n = 2, 3$, sometimes $n = 4$) in
order to illustrate definitions and reasoning that refer to arbitrary
$n$-dimensional spaces.

4. The foregoing shows that the planes $P_k$ and $P_l$ that are dealt
with in Theorem 3 are not contained in any plane of smaller
dimension than $r + 1$.

And  so  the  following  theorem  is  valid. 




Theorem 4. If the skew planes $P_k$ and $P_l$ lie in the plane $P_s$,
then

$$s \geq (k + l - m) + 1 \tag{6}$$

(Here, as above, $m$ is the dimension of the intersection $L_k \cap L_l$.)

Corollary. If in $\mathfrak{A}_n$ we have skew planes $P_k$ and $P_l$ of positive
dimensions, then

$$k \leq n - 2, \quad l \leq n - 2 \tag{7}$$

The inequalities (7) follow from relation (6) for $s = n$ since for
skew planes we have $k - m \geq 1$, $l - m \geq 1$.

Special  case.  A  hyperplane  cannot  be  skew  to  any  other  plane 
of  positive  dimension. 

5.  Retaining  the  notation  of  the  preceding  subsection,  let  us 
state  a  sufficient  condition  for  the  intersection  of  two  planes. 

Theorem 5. If in $\mathfrak{A}_n$ we are given planes $P_k$ and $P_l$ such that

$$k + l - m \geq n \tag{8}$$

where $m$ is the dimension of the intersection $L_m$ of the direction
subspaces $L_k$ and $L_l$, then $P_k$ and $P_l$ intersect.

Proof. Excluding the trivial case where one of the given planes
coincides with the entire space, we have

$$k < n, \quad l < n \tag{9}$$

Only three possibilities are permissible for the positions of the
two given planes:

$P_k$ is parallel to $P_l$, or

$P_k$ and $P_l$ are skew planes, or

$P_k$ and $P_l$ intersect.

If $P_k$ is parallel to $P_l$, then for the dimension $m$ of the intersection
of the corresponding subspaces $L_k$ and $L_l$ we have

$$m = \min(k, l) \tag{10}$$

and relations (9) and (10) contradict inequality (8). If $P_k$ and $P_l$
are skew to each other, inequality (6) holds for $s = n$, which
again contradicts (8). We are thus left with the only remaining
possibility, namely that $P_k$ and $P_l$ intersect. Theorem 5 is
proved.

Remark. It is easy to demonstrate that under the hypothesis of
Theorem 5 the equation $k + l - m = n$ actually holds true. However,
in estimations it is easier to verify an inequality than an
equation, and so we state the sufficient condition for intersection
of planes in the form of an inequality (8).
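The three mutual positions used in the proof of Theorem 5 can be told apart by rank computations. The following is a minimal sketch, not part of the original text: it assumes each plane is given by a point and a list of vectors spanning its direction subspace, and the names `rank` and `classify` are hypothetical helpers introduced only for illustration. Following the definition of Subsection 2, the case where one plane lies inside the other is reported as parallel.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def classify(A, U, B, V):
    """Mutual position of the planes (A, span U) and (B, span V)."""
    k, l = rank(U), rank(V)
    s = rank(U + V)                      # dim(L_k + L_l) = k + l - m
    AB = [b - a for a, b in zip(A, B)]   # vector from A to B
    if s == max(k, l):                   # L_k lies in L_l, or L_l lies in L_k
        return "parallel"
    if rank(U + V + [AB]) == s:          # AB belongs to L_k + L_l
        return "intersect"
    return "skew"

# Two skew lines in three-dimensional space.
print(classify((0, 0, 0), [[1, 0, 0]], (0, 0, 1), [[0, 1, 0]]))
# Two 2-dimensional planes in four-dimensional space with m = 0:
# here k + l - m = 4 = n, so by Theorem 5 they must intersect.
print(classify((0, 0, 0, 0), [[1, 0, 0, 0], [0, 1, 0, 0]],
               (1, 2, 3, 4), [[0, 0, 1, 0], [0, 0, 0, 1]]))
```

The intersection test rests on the argument in the proof of Theorem 3: the planes have a common point exactly when the vector $\overrightarrow{AB}$ belongs to $L_k + L_l$.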




6. We now turn to an algebraic interpretation of the theorem on
the intersection of planes.

Suppose we have two nonhomogeneous and separately consistent
systems of linear equations whose ranks are equal to $r_1$ and $r_2$. We
combine these two systems; that is, we regard all the equations
jointly. For this combined system we construct a corresponding
homogeneous system and denote its rank by $r_0$.

If $r_0 \geq r_1 + r_2$, then the combined nonhomogeneous system of
equations is consistent.

Indeed, if the given systems of equations determine the planes $P_k$
and $P_l$, then the homogeneous system corresponding to the union
of the given systems determines $L_m = L_k \cap L_l$. We accordingly
have $k = n - r_1$, $l = n - r_2$, $m = n - r_0$. Thus,
$k + l - m = n + r_0 - (r_1 + r_2) \geq n$ and, hence, the planes $P_k$ and $P_l$
intersect, which signifies that the combined system is consistent.

By way of an exercise, we leave it to the reader to prove in
purely algebraic fashion the assertion stated in this subsection,
relying not on Theorem 5 but on the Kronecker-Capelli theorem,
and to verify at the same time that the inequality $r_1 + r_2 \leq r_0$
actually implies the equation $r_1 + r_2 = r_0$.
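The rank criterion of this subsection is easy to try out numerically. The sketch below is an illustration only, under the assumption that each system is written as a coefficient matrix together with a column of right-hand sides; the helper names `rank` and `consistent` are hypothetical, and consistency is checked by the Kronecker-Capelli theorem (equal ranks of the coefficient matrix and the augmented matrix).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def consistent(M, rhs):
    """Kronecker-Capelli test: rank M equals rank of the augmented matrix."""
    augmented = [row + [c] for row, c in zip(M, rhs)]
    return rank(M) == rank(augmented)

# System 1: x1 = 1; system 2: x2 = 2 (both in three unknowns).
M1, c1 = [[1, 0, 0]], [1]
M2, c2 = [[0, 1, 0]], [2]
r1, r2, r0 = rank(M1), rank(M2), rank(M1 + M2)
print(r0 >= r1 + r2, consistent(M1 + M2, c1 + c2))

# System 3: x1 = 2 contradicts system 1; here r0 < r1 + r3.
M3, c3 = [[1, 0, 0]], [2]
print(rank(M1 + M3) >= r1 + rank(M3), consistent(M1 + M3, c1 + c3))
```

The second example shows only that the criterion does not apply when $r_0 < r_1 + r_2$; in that case the combined system may or may not be consistent.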

§  8.  Systems  of  linear  inequalities  and  convex  polyhedrons 

1. In this section we consider a real $n$-dimensional affine space
$\mathfrak{A}_n$, assuming as given an affine coordinate system.

2. Suppose a straight line through a point $X_0 \in \mathfrak{A}_n$ having
coordinates $(x_1^0, \dots, x_n^0)$ is drawn in the direction of a vector $l$,
whose coordinates we denote by $\{l_1, \dots, l_n\}$. By Subsection 6,
Section 3, this straight line may be specified by the parametric
equations

$$x_i = x_i^0 + \tau l_i, \quad i = 1, \dots, n, \quad -\infty < \tau < +\infty \tag{1}$$

Let certain points $A$ and $B$ be chosen on line (1). The corresponding
values of the parameter $\tau$ will be denoted by $\tau_1$ and $\tau_2$.
Suppose $\tau_1 < \tau_2$.

Definition. The set of points of the line that satisfy the inequalities

$$\tau_1 \leq \tau \leq \tau_2$$

is called a line segment $AB$.

3. If the point $A$ has coordinates $(a_1, \dots, a_n)$ and the point $B$
has coordinates $(b_1, \dots, b_n)$, then for the direction vector of the
line we can take the vector $l = \overrightarrow{AB}$. Then $l_i = b_i - a_i$, and for the
running point of the line we have

$$x_i = a_i + (b_i - a_i)\tau = (1 - \tau) a_i + \tau b_i$$

We have $\tau = 0$ at $A$ and $\tau = 1$ at $B$ so that the line segment $AB$
is now given by the inequalities $0 \leq \tau \leq 1$. Set $1 - \tau = \alpha$, $\tau = \beta$.
Then for the points of the line segment $AB$, and only for them, we
have

$$x_i = \alpha a_i + \beta b_i, \quad i = 1, \dots, n, \quad \alpha \geq 0, \quad \beta \geq 0, \quad \alpha + \beta = 1 \tag{2}$$

The point at which $\alpha = \beta = \frac{1}{2}$ is called the midpoint of the line
segment $AB$.

4.  Definition.  A  set  of  points  of  real  affine  space  is  said  to  be 
convex  if  together  with  every  two  of  its  points  A,  B  it  also  contains 
the  line  segment  AB. 

The most elementary instances of convex sets are a line segment,
a plane of arbitrary dimension, the entire space $\mathfrak{A}_n$.

The set consisting of one point and the empty set are also regarded
as convex sets.

From the definition it follows immediately that the intersection
of any collection of convex sets is itself a convex set. Indeed, if
points $A$, $B$ belong to the intersection of some collection of convex
sets, then the line segment $AB$ belongs to each of these sets, and
hence to their intersection.

5. Given in the space $\mathfrak{A}_n$ an arbitrary hyperplane

$$A_1 x_1 + \dots + A_n x_n + C = 0 \tag{3}$$

The hyperplane (3) divides the space into two parts called open
half-spaces. Their points are described by the inequalities

$$\sum A_i x_i + C < 0 \quad \text{and} \quad \sum A_i x_i + C > 0 \tag{4}$$

respectively. By adjoining the hyperplane (3) to an open half-space
we get what is called a closed half-space. One of them
consists of points whose coordinates satisfy the inequality

$$\sum A_i x_i + C \leq 0 \tag{5}$$

the other, of points whose coordinates satisfy the inequality

$$\sum A_i x_i + C \geq 0 \tag{6}$$

6. It is an essential fact that the space at hand is a real space.
In the complex case, no hyperplane divides the space, just as no
straight line can divide three-dimensional real space. This means
that if the points $A$ and $B$ in complex space do not belong to a
hyperplane, then they can be joined by a line that does not intersect
this hyperplane. Contrariwise, if in real space the points $A$
and $B$ belong to two different open half-spaces (4), then any curve
joining $A$ and $B$ intersects the hyperplane (3). We omit the proof.

7.  Theorem  1.  Every  half-space  is  a  convex  set. 

We carry out the proof for the half-space (5). Let the points $A$
and $B$ belong to this half-space. Then

$$\sum A_i a_i + C \leq 0, \quad \sum A_i b_i + C \leq 0$$

and for an arbitrary point $X$ of the line segment $AB$, taking into
account (2), we get

$$\sum A_i x_i + C = \sum A_i (\alpha a_i + \beta b_i) + C(\alpha + \beta) = \alpha\left(\sum A_i a_i + C\right) + \beta\left(\sum A_i b_i + C\right) \leq 0$$

Thus, the point $X$ belongs to the half-space (5). But $X$ was chosen
arbitrarily on $AB$ and so the entire line segment $AB$ belongs to
the half-space (5), which is what we set out to prove.

8. Definition. The intersection (if it is not empty) of a finite
number of half-spaces is a convex polyhedron.

We confine ourselves to polyhedrons formed by the intersection
of closed half-spaces.

Pictorially, a convex polyhedron is a piece of space cut out by
several hyperplanes (for $n = 3$ see Fig. 17). This piece may extend
to infinity (Fig. 18). It may also occur that the polyhedron lies
entirely in some $k$-dimensional plane, $k < n$ (see Fig. 19 for
$n = 3$, $k = 2$).

If we have $m$ half-spaces given by the inequalities

$$\sum_{j=1}^{n} A_{ij} x_j + C_i \leq 0, \quad i = 1, \dots, m \tag{7}$$

then the polyhedron (the intersection of the half-spaces (7)) includes
those and only those points whose coordinates satisfy the
system of inequalities

$$\begin{aligned} A_{11} x_1 + \dots + A_{1n} x_n + C_1 &\leq 0, \\ \dots \\ A_{m1} x_1 + \dots + A_{mn} x_n + C_m &\leq 0 \end{aligned} \tag{8}$$

On  the  other  hand,  if  a  system  of  type  (8)  is  consistent,  then  it 
determines  a  polyhedron  formed  by  the  intersection  of  the  half¬ 
spaces  (7). 


Remark. It is clear that an inequality of type (6) can always be
replaced by an inequality of type (7) by multiplying all its coefficients
by $(-1)$.


9. A polyhedron is called an $n$-dimensional parallelepiped if in
some affine coordinate system it is specified by inequalities of the
form

$$\xi_i \leq x_i \leq \eta_i, \quad i = 1, \dots, n \tag{9}$$

where $\xi_i$, $\eta_i$ are scalars. In particular, we say that the parallelepiped
is constructed on the independent vectors $e_1, \dots, e_n$ applied
to the point $O$ if it is specified by the inequalities

$$0 \leq x_i \leq 1, \quad i = 1, \dots, n \tag{10}$$

in coordinates with origin $O$ and basis $e_1, \dots, e_n$.

The inequalities (9) can always be reduced to the form (10) by
means of a transformation of the affine coordinates.

When $n = 1$, an $n$-dimensional parallelepiped is a line segment,
when $n = 2$, it is a parallelogram.

That portion of the parallelepiped (10) located in one of the
hyperplanes $x_i = 0$ or $x_i = 1$ is itself an $(n - 1)$-dimensional
parallelepiped and is called an $(n - 1)$-dimensional face of the
parallelepiped (10). We can also consider the faces of these $(n - 1)$-dimensional
parallelepipeds, the faces of their faces, and so forth.
We thus obtain a collection of $k$-dimensional parallelepipeds of
different dimensions $k$, $n - 1 \geq k \geq 1$. They are all called $k$-dimensional
faces of the original parallelepiped (10). The one-dimensional
faces are called edges and their extremities are called
vertices of the parallelepiped. It can be shown that the vertices of
parallelepiped (10) are points, and only those points, each of the
coordinates of which is either zero or unity.
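The faces of the parallelepiped (10) can be listed mechanically: a $k$-dimensional face is obtained by letting $k$ of the coordinates range over $[0, 1]$ and fixing each of the remaining $n - k$ coordinates at 0 or 1. The sketch below is an illustration only; the function name `faces` and the encoding of a face as a tuple (with `None` marking a free coordinate) are assumptions made here, not notation from the text. The count $\binom{n}{k} 2^{n-k}$ used in the check is the standard one for this construction.

```python
from itertools import combinations, product
from math import comb

def faces(n, k):
    """k-dimensional faces of the parallelepiped 0 <= x_i <= 1, i = 1..n.

    Each face is a tuple of length n whose entries are 0 or 1 for a
    fixed coordinate and None for a coordinate ranging over [0, 1].
    """
    result = []
    for free in combinations(range(n), k):
        fixed = [i for i in range(n) if i not in free]
        for values in product((0, 1), repeat=len(fixed)):
            face = [None] * n
            for i, v in zip(fixed, values):
                face[i] = v
            result.append(tuple(face))
    return result

n = 3
vertices = faces(n, 0)   # all coordinates fixed at 0 or 1
edges = faces(n, 1)      # exactly one free coordinate
for k in range(n):
    assert len(faces(n, k)) == comb(n, k) * 2 ** (n - k)
print(len(vertices), len(edges), len(faces(n, 2)))
```

For $n = 3$ this lists the 8 vertices, 12 edges and 6 two-dimensional faces of an ordinary parallelepiped.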

Example. In three-dimensional Euclidean space with a specified
rectangular Cartesian coordinate system $(x, y, z)$, let us consider
the rectangular parallelepipeds whose edges are parallel to the
coordinate axes. Let $(x_0, y_0, z_0)$ be the coordinates of the centre
of the parallelepiped, and $a$, $b$, $c$ the lengths of the edges parallel
to the axes $x$, $y$, $z$ respectively. Denote by $\mathcal{A}$ the set of parallelepipeds
of this kind whose centres lie in the cube $|x| \leq \xi$, $|y| \leq \xi$,
$|z| \leq \xi$, and the lengths of whose edges do not exceed $\eta$. To each
parallelepiped of the set $\mathcal{A}$ there can be associated a point of a
six-dimensional affine space $\mathfrak{A}_6$ with coordinates $(x_0, y_0, z_0, a, b, c)$.
Then the set $\mathcal{A}$ itself can be regarded as a six-dimensional
parallelepiped:

$$-\xi \leq x_0 \leq \xi, \quad -\xi \leq y_0 \leq \xi, \quad -\xi \leq z_0 \leq \xi,$$

$$0 \leq a \leq \eta, \quad 0 \leq b \leq \eta, \quad 0 \leq c \leq \eta$$


Note  that  geometric  figures  of  one  space  are  often  conveniently 
regarded  as  points  of  another  space. 


10. Definition. A set of points in the affine space $\mathfrak{A}_n$ is said to
be bounded if the coordinates of all the points of the set satisfy
the inequality $|x_i| \leq M$ ($M$ a number greater than zero).

It is easy to verify, through the use of the formulas of Section 2,
that this definition does not depend on the choice of the affine
coordinate system. A set is bounded if and only if it is contained in
a certain parallelepiped.

11. Definition. The convex hull of a set $\mathcal{A}$ of points in the affine
space $\mathfrak{A}$ is defined as that convex set $\overline{\mathcal{A}} \subset \mathfrak{A}$ which is contained in
any convex set containing $\mathcal{A}$.

In other words, the convex hull $\overline{\mathcal{A}}$ is the intersection of all possible
convex sets containing the given set $\mathcal{A}$. We also say that $\overline{\mathcal{A}}$
is the smallest convex set containing $\mathcal{A}$ (Fig. 20).



Example.  The  convex  hull  of  two  points  A,  B  is  the  line  seg¬ 
ment  AB. 

It  can  be  proved  that  the  convex  hull  of  any  finite  number  of 
points  is  a  bounded  convex  polyhedron  and  that  any  bounded  con¬ 
vex  polyhedron  of  type  (8)  is  a  convex  hull  of  some  finite  system 
of  points  called  its  vertices. 

12. We show a geometric construction that is frequently found
to be useful in dealing with convex hulls.

Given a convex set $\mathcal{A}$ and a point $M$. Construct all possible line
segments of the form $MX$, $X \in \mathcal{A}$; denote the set of points of all
such line segments by $\mathcal{B}$ (Fig. 21). Then the following theorem
holds.

Fig. 21  Fig. 22

Theorem 2. The set $\mathcal{B}$ is a convex hull of the union $\mathcal{A} \cup M$.

Proof. If $M \in \mathcal{A}$, then $\mathcal{B} = \mathcal{A}$ and the assertion of the theorem
is obvious. Suppose $M$ does not belong to the set $\mathcal{A}$. Any convex
set containing $\mathcal{A} \cup M$ must contain all of $\mathcal{B}$. Therefore it suffices to
verify the convexity of $\mathcal{B}$. Let the points $A, B \in \mathcal{B}$. Then $A$ lies on
some line segment $MX$ and $B$ lies on some line segment $MY$,
where $X, Y \in \mathcal{A}$ (Fig. 22). We have to establish that the line
segment $AB$ lies entirely in the set $\mathcal{B}$. Let $C$ be an arbitrary point
of $AB$. Then (if we exclude the trivial cases where one of the
points $A$, $B$ coincides with one of the points $M$, $X$, $Y$) we have

$$\overrightarrow{MA} = \lambda \overrightarrow{MX}, \quad 0 < \lambda < 1,$$

$$\overrightarrow{MB} = \mu \overrightarrow{MY}, \quad 0 < \mu < 1;$$

$$\overrightarrow{MC} = \alpha \overrightarrow{MA} + \beta \overrightarrow{MB}, \quad \alpha \geq 0, \quad \beta \geq 0, \quad \alpha + \beta = 1$$

There will be a point $Z$ on $XY$ such that

$$\overrightarrow{MZ} = \frac{\alpha\lambda}{\nu} \overrightarrow{MX} + \frac{\beta\mu}{\nu} \overrightarrow{MY}$$

where $\nu = \alpha\lambda + \beta\mu$, $0 < \nu < 1$. The point $Z$ is contained in the
set $\mathcal{A}$ since the latter is convex. It is readily seen that

$$\overrightarrow{MC} = \nu \overrightarrow{MZ}$$

which means that the point $C$ belongs to the line segment
$MZ \subset \mathcal{B}$, and this concludes the proof of Theorem 2.

13.  We  note  some  elementary  properties  of  a  convex  hull. 

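(1) A set $\mathcal{A}$ coincides with its convex hull if and only if it is
convex.

(2) If $\mathcal{A}_1 \subset \mathcal{A}_2$, then the convex hull of the set $\mathcal{A}_1$ is contained
in the convex hull of the set $\mathcal{A}_2$.

These two properties follow directly from the definitions of a
convex set and a convex hull.

(3) Let $\mathcal{A} = \mathcal{A}_1 \cup \mathcal{A}_2$ and let $\overline{\mathcal{A}}_1$ be the convex hull of the
set $\mathcal{A}_1$. Then the convex hull $\overline{\mathcal{A}}$ of the set $\mathcal{A}$ coincides with
the convex hull of the union $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$.

Proof. $\mathcal{A}_1 \cup \mathcal{A}_2 \subset \overline{\mathcal{A}}_1 \cup \mathcal{A}_2$. Therefore $\overline{\mathcal{A}}$ is contained in the
convex hull of the set $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$. On the other hand, $\overline{\mathcal{A}}$ is a convex
set that contains $\overline{\mathcal{A}}_1$ and $\mathcal{A}_2$. Therefore the convex hull of
the union $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$ is contained in $\overline{\mathcal{A}}$. Thus, the set $\overline{\mathcal{A}}$ and the
convex hull of the set $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$ coincide.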

14. Given, in an affine space $\mathfrak{A}$, the points $A_0, A_1, \dots, A_p$ with
radius vectors $a_0, a_1, \dots, a_p$ respectively. The following theorem
holds.

Theorem 3. The convex hull of the system of points $A_0, A_1, \dots, A_p$
is given by the formula

$$x = \alpha_0 a_0 + \alpha_1 a_1 + \dots + \alpha_p a_p \tag{11}$$

where $x$ is the radius vector of an arbitrary point in the convex
hull, and the numbers $\alpha_0, \dots, \alpha_p$ satisfy the conditions

$$\alpha_0 + \alpha_1 + \dots + \alpha_p = 1, \quad \alpha_0 \geq 0, \quad \alpha_1 \geq 0, \quad \dots, \quad \alpha_p \geq 0 \tag{12}$$

The proof is carried out by means of induction on the number of
points. Theorem 3 holds true for two points since for $p = 1$ formulas
(11) and (12) specify the line segment $A_0 A_1$.

Let Theorem 3 be proved for $p + 1$ points. Consider the points
$A_0, \dots, A_p$. Denote their convex hull by $\mathcal{A}$. Add another point
$A_{p+1}$ with radius vector $a_{p+1}$ and construct the convex hull $\mathcal{B}$ of
the union $\mathcal{A} \cup A_{p+1}$. By Property 3 of Subsection 13, the set $\mathcal{B}$
coincides with the convex hull of the system of points $A_0, A_1, \dots,
A_p, A_{p+1}$. By Theorem 2, the set $\mathcal{B}$ consists of all possible line
segments $A_{p+1}X$, where $X \in \mathcal{A}$. The radius vector $x$ of point $X$ is
given by equation (11). Denote by $y$ the radius vector of an arbitrary
point of the line segment $A_{p+1}X$. Then

$$y = \alpha x + \beta a_{p+1}, \quad \alpha + \beta = 1, \quad \alpha \geq 0, \quad \beta \geq 0 \tag{13}$$

Put

$$\beta_i = \alpha \alpha_i, \quad i = 0, \dots, p; \quad \beta_{p+1} = \beta \tag{14}$$

From formulas (11) to (14), we get

$$y = \beta_0 a_0 + \dots + \beta_p a_p + \beta_{p+1} a_{p+1}, \quad \beta_0 + \dots + \beta_p + \beta_{p+1} = 1, \quad \beta_i \geq 0, \quad i = 0, \dots, p + 1 \tag{15}$$

Thus, every point of the set $\mathcal{B}$ satisfies the relations (15). Conversely,
substituting the quantities (14) into (15), we get for $y$
an expression of type (13), where $x$ satisfies the relations (11)
and (12). This means that every point that satisfies conditions
(15) belongs to the set $\mathcal{B}$, which completes the proof of Theorem 3.
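The induction step can be followed on a small numerical example. The sketch below is an illustration only: the points, the weights and the helper name `combine` are chosen here and do not come from the text. It takes a point $x$ of the form (11) on three points of the plane, joins it to a fourth point as in (13), and checks that the weights (14) reproduce the same point through (15).

```python
from fractions import Fraction as F

def combine(weights, points):
    """Point with radius vector sum(w_i * a_i); the weights must be
    nonnegative and sum to one, as in conditions (12)."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    dim = len(points[0])
    return tuple(sum(w * p[j] for w, p in zip(weights, points))
                 for j in range(dim))

# Points A_0, A_1, A_2 and weights alpha_i as in (11), (12).
points = [(0, 0), (4, 0), (0, 4)]
alphas = [F(1, 2), F(1, 4), F(1, 4)]
x = combine(alphas, points)

# A new point A_3 and a point y of the segment A_3 X, as in (13).
a_new = (4, 4)
alpha, beta = F(1, 3), F(2, 3)
y = tuple(alpha * xi + beta * ai for xi, ai in zip(x, a_new))

# Weights (14) and the representation (15).
betas = [alpha * a for a in alphas] + [beta]
print(x, y, combine(betas, points + [a_new]) == y)
```

Exact fractions are used so that the condition "weights sum to one" can be checked without rounding.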

15. Definition. The convex hull of a set of points $A_0, A_1, \dots, A_r$
lying in the general position is called an $r$-dimensional simplex
with vertices $A_0, A_1, \dots, A_r$.

From Theorem 3 it follows that a simplex with vertices
$A_0, \dots, A_r$ is specified by the formulas (11) and (12) for $p = r$.
Here, the numbers $\alpha_0, \dots, \alpha_r$ are called the barycentric coordinates
of the point of the simplex having the radius vector $x$.
Particular cases:

a zero-dimensional simplex is a single point,
a one-dimensional simplex is a line segment,
a two-dimensional simplex is a triangle,
a three-dimensional simplex is a triangular pyramid.

The point of a simplex at which all barycentric coordinates are
equal ($\alpha_0 = \dots = \alpha_r = \frac{1}{r+1}$) is called the centre of the simplex.

Let $T_r$ be a simplex with vertices $A_0, A_1, \dots, A_r$ and let
$A_{i_0}, A_{i_1}, \dots, A_{i_k}$ be certain of its vertices. The $k$-dimensional
simplex which is a convex hull of the vertices $A_{i_0}, A_{i_1}, \dots, A_{i_k}$
is called a $k$-dimensional face of the simplex $T_r$.

One-dimensional faces, that is to say, line segments joining vertices,
are called the edges of the simplex.

Two faces of dimensions $k$ and $r - (k + 1)$ are called opposite
faces of the simplex $T_r$ if they do not have any vertices in common.
As an exercise, the reader is advised to prove that a simplex is a
convex hull of a pair of opposite faces, that the opposite faces of
a simplex always lie in skew planes, and that the line segment
joining the centres of opposite faces passes through the centre of
the simplex.


16. We will prove that an $n$-dimensional simplex in $n$-dimensional
space is the intersection of $n + 1$ closed half-spaces.

Let $A_0, A_1, \dots, A_n$ be the vertices of the simplex $T_n$. Take $A_0$ for
the coordinate origin and choose the basis as follows:

$$e_1 = \overrightarrow{A_0 A_1}, \quad e_2 = \overrightarrow{A_0 A_2}, \quad \dots, \quad e_n = \overrightarrow{A_0 A_n}$$

Then the relations (11) and (12) (for $p = n$) assume, in coordinates,
the form

$$\begin{aligned} & x_1 = \alpha_1, \quad x_2 = \alpha_2, \quad \dots, \quad x_n = \alpha_n, \\ & \alpha_0 + \alpha_1 + \dots + \alpha_n = 1, \\ & \alpha_0 \geq 0, \quad \alpha_1 \geq 0, \quad \dots, \quad \alpha_n \geq 0 \end{aligned} \tag{16}$$

whence it follows that

$$x_1 \geq 0, \quad x_2 \geq 0, \quad \dots, \quad x_n \geq 0, \quad x_1 + x_2 + \dots + x_n \leq 1 \tag{17}$$

On the other hand, (17) implies (16) if we put $\alpha_i = x_i$ for
$i = 1, \dots, n$, $\alpha_0 = 1 - (x_1 + x_2 + \dots + x_n)$. Thus, the systems
(16) and (17) are equivalent and specify one and the same
simplex $T_n$ (for $n = 3$ see Fig. 23).

The system of inequalities (17) shows via the intersection of
which half-spaces the simplex $T_n$ is formed.
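The equivalence of (16) and (17) gives a direct membership test for the simplex. The sketch below is an illustration only, and it assumes that the point is already given by its coordinates in the system with origin $A_0$ and basis $e_i = \overrightarrow{A_0 A_i}$; the function names `barycentric` and `in_simplex` are hypothetical.

```python
from fractions import Fraction as F

def barycentric(x):
    """Barycentric coordinates (alpha_0, ..., alpha_n) read off from (16):
    alpha_i = x_i for i >= 1 and alpha_0 = 1 - (x_1 + ... + x_n)."""
    return (1 - sum(x),) + tuple(x)

def in_simplex(x):
    """Membership test by the system of inequalities (17)."""
    return all(xi >= 0 for xi in x) and sum(x) <= 1

centre = (F(1, 4), F(1, 4), F(1, 4))   # centre of a 3-dimensional simplex
outside = (1, 1, 0)

for point in (centre, outside):
    alphas = barycentric(point)
    # (17) holds exactly when all barycentric coordinates are nonnegative.
    assert in_simplex(point) == all(a >= 0 for a in alphas)
    print(point, alphas, in_simplex(point))
```

For a simplex given in some other coordinate system one would first have to pass to the coordinates used here, which is not shown.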

17. We have already mentioned that a polyhedron can be pictured
as a piece of space cut out by several hyperplanes. It can be
proved that if a polyhedron is bounded, then the number $m$ of the
cutting hyperplanes, that is, the number of half-spaces the intersection
of which forms the polyhedron, must exceed the dimensionality
of the space.

The simplex corresponds to the smallest possible number
$m = n + 1$.

18. Let a polyhedron be specified by a system of inequalities of
type (8), and suppose there is a function

$$z = c_0 + c_1 x_1 + \dots + c_n x_n \tag{18}$$

where $c_0, c_1, \dots, c_n$ are numerical coefficients, and $x_1, \dots, x_n$ are
the coordinates of the point in $\mathfrak{A}_n$.

The  problem  of  finding  the  maximum  and  minimum  of  function 
(18)  on  the  polyhedron  (8)  is  of  such  great  importance  in  applica¬ 
tions  (economics,  for  example)  that  the  investigation  of  this  pro¬ 
blem  and  the  development  of  numerical  methods  for  its  solution 
now  constitute  a  whole  field  of  research  called  linear  program¬ 
ming. 

Note  on  the  other  hand  that  the  geometrical  theory  of  convex 
polyhedrons  is  a  substantial  aid  to  the  algebraic  theory  of  linear 
inequalities. 
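One elementary link between function (18) and Subsection 11 can be illustrated numerically. Since the function is of the first degree, its value at a point (11) of a convex hull is the same combination of its values at the points $A_i$, and so on a bounded polyhedron, taken as the convex hull of its vertices, the largest and smallest values occur at vertices. The sketch below is an illustration only and is not a practical method of linear programming: it simply tries every vertex of the parallelepiped (10), and the names `z` and `extremes_on_cube` as well as the coefficients are chosen here for the example.

```python
from itertools import product
from fractions import Fraction as F

def z(c, x):
    """Value of function (18): c[0] + c[1]*x_1 + ... + c[n]*x_n."""
    return c[0] + sum(ci * xi for ci, xi in zip(c[1:], x))

def extremes_on_cube(c):
    """Minimum and maximum of (18) over the vertices of 0 <= x_i <= 1."""
    n = len(c) - 1
    values = [z(c, v) for v in product((0, 1), repeat=n)]
    return min(values), max(values)

c = (1, 2, -3, 5)
low, high = extremes_on_cube(c)
inner = z(c, (F(1, 2), F(1, 2), F(1, 2)))   # value at an interior point
print(low, high, inner)
assert low <= inner <= high
```

The number of vertices grows as $2^n$, which is one reason why special numerical methods are needed in practice.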


Chapter  IV 


LINEAR,  BILINEAR 
AND  QUADRATIC  FORMS 


§  1.  Linear  forms 

1.  Suppose  that  in  a  linear  space  L  is  given  a  numerical  func¬ 
tion  of  a  vector  argument,  that  is,  to  every  vector  x  there  is  as¬ 
sociated  a  number  a(x). 

In  this  chapter  we  regard  the  function  a(x)  in  the  generally 
accepted  manner,  namely  we  consider  it  invariant.  This  means 
that  the  value  of  a(x)  does  not  depend  on  the  choice  of  basis  in 
the  space  L. 

Remark.  In  some  of  the  chapters  later  on  we  will  have  to  give 
up  this  generally  accepted  viewpoint  and  regard  functions  whose 
numerical  value  is  determined  for  a  given  x^L  (or  for  given 
x,  y,  . . .  e  L)  by  means  of  a  basis  in  L  and  may  depend  on  the 
choice  of  basis.  Incidentally,  in  this  case  as  well  we  can  revert  to 
the  generally  accepted  viewpoint  by  extending  the  concept  of  the 
domain  of  definition  of  a  function.  Indeed,  if  by  <S  we  denote  the 
set  of  all  bases  of  space  L,  then  we  can  consider  the  function 
a(x ,  e),  where  x  e  L,  e^<g.  We  obtain  the  ordinary  (invariant) 
function  a{x)  if  a(x ,  e)  =  a(x)  for  all 

Definition. A function a(x) is said to be linear if:

(1) a(x + y) = a(x) + a(y) for any vectors x, y in L;

(2) a(\alpha x) = \alpha a(x) for any scalar \alpha and any vector x in L.

For values of the function a(x) we will take real numbers if L
is real and we will admit complex numbers if L is complex.

2. Examples. (1) Let x = x_1 e_1 + \dots + x_n e_n, where
e_1, \dots, e_n is a basis in L. In each basis e_1, \dots, e_n put
a(x) = x_1. Then the properties (1) and (2) of Subsection 1 hold true
for a(x), but a(x) does not satisfy the definition of a linear function
since it depends on the chosen basis.

(2) Let L be the space of polynomials of degree not exceeding n.
Let every polynomial x(\tau) in L be associated with a number a(x)
by the formula

a(x) = \int_{\tau_1}^{\tau_2} x(\tau) \, d\tau    (1)

where \tau_1 \le \tau \le \tau_2 is a given interval on the number axis.
It is clear that the numerical value of a(x) does not depend on the
choice of basis in L. Conditions (1) and (2) of Subsection 1 hold
due to the familiar properties of a definite integral. Thus, function
(1) is a linear function in the space L.
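Example (2) can be tried out numerically. The following is a minimal sketch, assuming that a polynomial is stored as its list of coefficients in increasing powers; the function name integral_functional and the sample data are illustrative and not part of the text.

```python
from fractions import Fraction


def integral_functional(coeffs, t1, t2):
    """a(x) = integral of x(tau) over [t1, t2], where x(tau) = sum coeffs[k] * tau**k."""
    return sum(
        Fraction(c) * (Fraction(t2) ** (k + 1) - Fraction(t1) ** (k + 1)) / (k + 1)
        for k, c in enumerate(coeffs)
    )


x = [1, 2, 3]            # x(tau) = 1 + 2 tau + 3 tau^2
y = [0, -1, 4]           # y(tau) = -tau + 4 tau^2
alpha = Fraction(5, 2)   # an arbitrary scalar
t1, t2 = 0, 2            # the interval [tau_1, tau_2]

x_plus_y = [p + q for p, q in zip(x, y)]
alpha_x = [alpha * p for p in x]

# Condition (1): a(x + y) = a(x) + a(y)
additive = integral_functional(x_plus_y, t1, t2) == (
    integral_functional(x, t1, t2) + integral_functional(y, t1, t2)
)
# Condition (2): a(alpha x) = alpha a(x)
homogeneous = integral_functional(alpha_x, t1, t2) == alpha * integral_functional(x, t1, t2)

print(integral_functional(x, t1, t2), additive, homogeneous)
```

Exact rational arithmetic is used so that the two conditions can be compared with equality rather than up to rounding.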

Remark. The linear function (1) may also be considered in the
infinite-dimensional space of continuous functions specified on an
arbitrarily chosen interval [\tau', \tau''] subject to the condition that
\tau' \le \tau_1, \tau'' \ge \tau_2, or in the space of all functions
integrable on [\tau', \tau''] (this too is an infinite-dimensional space).

3. Let a linear function a(x) be given in the space L. Assuming
that L is n-dimensional, fix in it an arbitrary basis e_1, \dots, e_n and
expand the vector x in terms of this basis: x = x_1 e_1 + \dots + x_n e_n.
Then the linear function will be written as

a(x) = a(x_1 e_1 + \dots + x_n e_n) = x_1 a(e_1) + \dots + x_n a(e_n)    (2)

Denote by a_i the value of the function a(x) on the basis vector e_i:

a_1 = a(e_1), \dots, a_n = a(e_n)    (3)

If the basis is fixed, then the a_i represent quite definite numbers.
Substituting the quantities (3) into (2), we get an expression of the
function a(x) in the form of a homogeneous polynomial of first
degree in the components (coordinates) of the vector x:

a(x) = a_1 x_1 + a_2 x_2 + \dots + a_n x_n    (4)

4. Homogeneous polynomials of degree k are generally called
forms of degree k. For k = 1 we have the term linear form, for
k = 2, the quadratic form.

According to formula (4), every linear function a(x) in
n-dimensional linear space is a linear form in the components of its
argument x.

In this connection, it is usual to call linear functions linear
forms.

5. In the space L_n we pass to a new basis e'_1, \dots, e'_n by the
formula

e'_i = \sum_j P_{ij} e_j    (I)

(see Section 5, Chapter II). In the new basis, the linear form will
have new coefficients a'_i:

a(x) = a'_1 x'_1 + a'_2 x'_2 + \dots + a'_n x'_n

Let us find the a'_i using the fact that these numbers are the values
of the form a(x) on the new basis vectors:

a'_i = a(e'_i)

Using expression (I) for the vectors e'_i and taking advantage of
the linearity of the function a(x), we find

a'_i = a(\sum_j P_{ij} e_j) = \sum_j P_{ij} a(e_j) = \sum_j P_{ij} a_j

Thus

a'_i = \sum_j P_{ij} a_j    (5)

We see that (5) is fully analogous to (I).

6. We will now prove that the law of transformation of the
coefficients as expressed by formula (5) ensures the invariance of the
values of the function, which in the basis e_1, \dots, e_n is given
by (4).

For this purpose let us use the formulas (III) and (4) of
Section 5, Chapter II. In the new basis, set

a(x) = \sum_i a'_i x'_i = \sum_i (\sum_j P_{ij} a_j)(\sum_k Q_{ik} x_k)    (6)

Note that in (6) and in other similar cases the summation indices
in the brackets must be denoted by different letters to avoid
confusion when the brackets are removed. Removing brackets and
regrouping, we have

\sum_i a'_i x'_i = \sum_{i,j,k} a_j x_k P_{ij} Q_{ik}
                 = \sum_{j,k} a_j x_k (\sum_i P_{ij} Q_{ik}) = \sum_{j,k} a_j x_k \delta_{jk}    (7)

where \delta_{jk} is the Kronecker delta. If j \ne k, then \delta_{jk} = 0
and these terms are disregarded. If j = k, then \delta_{jj} = 1 so that
a_j x_j \delta_{jj} = a_j x_j. Therefore

\sum_{j,k} a_j x_k \delta_{jk} = \sum_j a_j x_j    (8)

Comparing (6)-(8), we finally get

a(x) = \sum_i a'_i x'_i = \sum_i a_i x_i

which states that the numerical value of a(x) is preserved under
a change of basis.
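The invariance just proved can be checked on a small numerical case. The sketch below assumes n = 2 and an arbitrarily chosen nonsingular matrix P; it uses formula (5) for the coefficients and the relation x_j = \sum_i P_{ij} x'_i (old components in terms of new ones) for the vector. All names and numbers are illustrative.

```python
# P[i][j] are the coefficients in e'_i = sum_j P[i][j] e_j (formula (I)).
P = [[1, 2],
     [0, 1]]
n = 2

a_old = [3, -4]   # coefficients a_j of the form in the old basis
x_new = [5, 7]    # components x'_i of some vector in the new basis

# Formula (5): a'_i = sum_j P[i][j] a_j
a_new = [sum(P[i][j] * a_old[j] for j in range(n)) for i in range(n)]

# Old components of the same vector: x_j = sum_i P[i][j] x'_i
x_old = [sum(P[i][j] * x_new[i] for i in range(n)) for j in range(n)]

value_old = sum(a_old[j] * x_old[j] for j in range(n))
value_new = sum(a_new[i] * x_new[i] for i in range(n))

print(value_old, value_new)
```

Both sums describe the same number a(x), computed once in the old basis and once in the new one.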

7. In a linear space L (which may be infinite-dimensional) let
us now consider all possible linear forms, that is, numerical linear
functions of one vector argument. We will regard a sum of functions
and the product of a function by a scalar in the ordinary
(arithmetical) sense. We then have

Theorem 1. The set L^* of all linear functions specified in the
space L constitutes a linear space.

Proof. We first demonstrate that the sum of two arbitrary linear
functions a(x), b(x) is a linear function. Set

c(x) = a(x) + b(x)

Then

c(x + y) = a(x + y) + b(x + y) = [a(x) + a(y)] + [b(x) + b(y)]
         = [a(x) + b(x)] + [a(y) + b(y)] = c(x) + c(y)

Besides,

c(\alpha x) = a(\alpha x) + b(\alpha x) = \alpha a(x) + \alpha b(x)
            = \alpha [a(x) + b(x)] = \alpha c(x)

Thus, the linearity of the sum is proved.

We now show that if a linear function is multiplied by an
arbitrary scalar \lambda, the result is a linear function. Let
c(x) = \lambda a(x). Then

c(x + y) = \lambda a(x + y) = \lambda a(x) + \lambda a(y) = c(x) + c(y)

Furthermore

c(\alpha x) = \lambda a(\alpha x) = \lambda \alpha a(x) = \alpha c(x)

We have thus demonstrated that \lambda a(x) is a linear function. Now,
if a(x), b(x) \in L^*, then a(x) + b(x) \in L^* and \lambda a(x) \in L^*
for any \lambda.

The zero element of L^* is a (linear) function 0(x) equal to zero
for every vector x.

The function (-1) \cdot a(x) is the negative of a(x).

It is easy to verify that all the axioms of a linear space hold
true for L^*, whence follows the validity of Theorem 1.

8. Definition. The linear space L^* of all linear functions defined
on L is called the conjugate space associated with L.

Remark. According to the definition of a linear function,
multiplication in a conjugate space is admissible by the same scalars
as in the original space. In other words, if L is real, then L^* is
real, and if L is complex, then L^* is complex.

9. Theorem 2. If a linear space is n-dimensional, then the
associated conjugate space is also n-dimensional.

Proof. We introduce the basis e_1, \dots, e_n in L and expand in
terms of this basis an arbitrary vector x in L:

x = x_1 e_1 + x_2 e_2 + \dots + x_n e_n

Then an arbitrary vector a from the conjugate space L^*, that is
to say, a linear function a(x), can be written as

a(x) = a_1 x_1 + a_2 x_2 + \dots + a_n x_n

and is uniquely defined by specification of an n-tuple of
coefficients (a_1, \dots, a_n). This n-tuple may be regarded as a vector
in the coordinate space K_n. When linear functions are added or
multiplied by a scalar, their coefficients are also added and multiplied
by that scalar. Therefore, in the given case, L^* is isomorphic to
the coordinate space K_n. Theorem 2 is proved.

10. To conclude this section we consider the geometric meaning
of a linear form. To do this we will take advantage of the affine
space \mathfrak{A}_n and consider vectors of L_n to be the radius vectors
of points in \mathfrak{A}_n laid off from a certain point O. We assume
that the value of the function a(x) at the point A is equal to its value
on the vector x = \overrightarrow{OA}. The function a(x) will thus be
defined in \mathfrak{A}_n.

The following assertions are valid.

(1) The set of points at which the linear function a(x) assumes
a constant value constitutes a hyperplane.

(2) Every hyperplane is a locus of points at which a certain
linear function retains a constant value.

(3) Hyperplanes corresponding to distinct values of a given
linear function a(x) are parallel.

(4) The hyperplane on which a(x) = 0 passes through the
coordinate origin.

To prove these facts it suffices to write the equation a(x) = c in
terms of the components:

a_1 x_1 + a_2 x_2 + \dots + a_n x_n = c

and take advantage of the results of Sections 6 and 7 of
Chapter III.

§  2.  Bilinear  forms 

1. A numerical function a(x, y) of two vector arguments x
and y is said to be bilinear if it is linear in each of the arguments,
that is,

(1) a(x_1 + x_2, y) = a(x_1, y) + a(x_2, y),
    a(\alpha x, y) = \alpha a(x, y);

(2) a(x, y_1 + y_2) = a(x, y_1) + a(x, y_2),
    a(x, \alpha y) = \alpha a(x, y).

Here, x, y, x_1, x_2, y_1, y_2 are any vectors in the space L and
\alpha is an arbitrary scalar.


2. Let L be an n-dimensional linear space and e_1, \dots, e_n a basis
in it, and let the arguments of the bilinear function be expanded
in terms of this basis:

x = \sum_i x_i e_i,    y = \sum_k y_k e_k

Then

a(x, y) = a(\sum_i x_i e_i, \sum_k y_k e_k) = \sum_{i,k} x_i y_k a(e_i, e_k)    (1)

We introduce the notation

a_{ik} = a(e_i, e_k)    (2)

and obtain

a(x, y) = \sum_{i,k=1}^{n} a_{ik} x_i y_k    (3)

Formula (3) expresses the function a(x, y) in terms of components
relative to the given basis.

The polynomial in the right member of (3) is called a bilinear
form. Also, the function a(x, y) itself is called a bilinear form. The
numbers a_{ik} are called the coefficients of the given form relative
to the basis e_1, \dots, e_n. The arguments x and y may be regarded
as vectors of real linear space or of complex linear space.
Accordingly, we say that the form a(x, y) is given in real space or
in complex space. In the latter case, complex numbers are
admissible as values of the form a(x, y). Likewise, the coefficients
a_{ik} are also, generally, complex numbers.
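Formulas (2) and (3) can be illustrated by a short computation. This is a minimal sketch for n = 2 with an arbitrarily chosen coefficient array; the helper names bilinear and unit are assumptions made for the example.

```python
def bilinear(A, x, y):
    """Formula (3): a(x, y) = sum over i, k of A[i][k] * x[i] * y[k]."""
    return sum(A[i][k] * x[i] * y[k] for i in range(len(x)) for k in range(len(y)))


def unit(n, i):
    """Components of the basis vector e_i in the basis e_1, ..., e_n (0-based i)."""
    return [1 if j == i else 0 for j in range(n)]


A = [[1, 2],
     [3, 4]]

# Formula (2): the coefficients are the values of the form on pairs of basis vectors.
recovered = [[bilinear(A, unit(2, i), unit(2, k)) for k in range(2)] for i in range(2)]

value = bilinear(A, [1, 2], [3, 4])
# Linearity in the first argument: a(x1 + x2, y) = a(x1, y) + a(x2, y)
left = bilinear(A, [1 + 5, 2 + 7], [3, 4])
right = bilinear(A, [1, 2], [3, 4]) + bilinear(A, [5, 7], [3, 4])

print(recovered, value, left == right)
```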

3. It is easy to demonstrate that the set of all bilinear forms
specified in a linear space L also forms a linear space (if we
understand addition of forms and multiplication by a scalar in the
ordinary arithmetic sense; see Section 1 where the proof is carried
out for linear forms).

4. Let us consider, in the given basis e_1, \dots, e_n, the one-term
bilinear forms

f_{ik}(x, y) = x_i y_k    (4)

From (2) and (3) we have

a(x, y) = \sum_{i,k} a_{ik} f_{ik}(x, y)    (5)

If we take x = e_l, y = e_m for any fixed l and m, then f_{lm} = 1
and all other forms of (4) will be zero. From this it follows that
the forms (4) are independent and so they form a basis in the
space of bilinear forms. Formula (5) gives the expansion of the
bilinear form (1) in terms of the basis (4).

The basis (4) consists of n^2 elements, and so the space of
bilinear forms has dimension n^2.


5. A bilinear form a(x, y) is said to be symmetric if for any
x, y \in L

a(x, y) = a(y, x)

A bilinear form a(x, y) is said to be skew-symmetric if for any
x, y \in L

a(x, y) = -a(y, x)

In the case of a symmetric bilinear form, the coefficients are
symmetric: a_{ik} = a_{ki} (see formula (2)). For a skew-symmetric
form, a_{ik} = -a_{ki} and, in particular, a_{ii} = 0. Both symmetric and
skew-symmetric bilinear forms form subspaces of the space of all
bilinear forms with arguments in L. To find the dimensions of
these subspaces, we construct bases in them.

A symmetric bilinear form may be written as

a(x, y) = \sum_{i<k} a_{ik} (x_i y_k + x_k y_i) + \sum_i a_{ii} x_i y_i    (6)

Consider the forms

T_{ik}(x, y) = x_i y_k + x_k y_i,    i \ne k,
T_{ii}(x, y) = x_i y_i                            (7)

(since T_{ik} = T_{ki}, only the pairs with i < k are counted). The
bilinear forms (7) are linearly independent and symmetric
and any symmetric bilinear form is expressible in terms of these
forms by a formula of type (6). For this reason, the forms (7)
constitute a basis in the subspace of all symmetric bilinear forms. The
number of elements in the basis of (7) is equal to
C_n^2 + n = \frac{1}{2} n(n + 1). Such also is the dimension of the
subspace of symmetric forms, whence it follows that for any choice of
N = \frac{1}{2} n(n + 1) independent symmetric bilinear forms
w_1(x, y), \dots, w_N(x, y), an arbitrary symmetric form can be
represented as

a(x, y) = \sum_{i=1}^{N} \lambda_i w_i(x, y)

where \lambda_i are numerical coefficients.

For skew-symmetric bilinear forms we have

a(x, y) = \sum_{i<k} a_{ik} (x_i y_k - x_k y_i)


and for a basis we can take the forms

h_{ik}(x, y) = x_i y_k - x_k y_i,    i < k

They total N = \frac{1}{2} n(n - 1). Hence, for any choice of N
independent skew-symmetric forms w_1, \dots, w_N we get for an arbitrary
skew-symmetric form a(x, y) the representation

a(x, y) = \sum_{i=1}^{N} \lambda_i w_i(x, y)

6. Theorem. The space of bilinear forms is a direct sum of the
subspace of symmetric and the subspace of skew-symmetric
bilinear forms.

Proof. Clearly, a bilinear form is simultaneously symmetric and
skew-symmetric only when it is zero, whence it follows from
Theorem 1, Section 14, Chapter I, that the sum of the subspaces under
consideration is a direct sum.

On the other hand, any bilinear form a(x, y) may be represented
as the sum of a symmetric and a skew-symmetric form, namely

a(x, y) = \frac{1}{2} [a(x, y) + a(y, x)] + \frac{1}{2} [a(x, y) - a(y, x)]

Hence, the direct sum of these subspaces coincides with the entire
space. The proof of the theorem is complete.
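In terms of coefficients, the decomposition used in the proof splits the array a_{ik} into a symmetric and a skew-symmetric part. A minimal sketch, assuming a 2 x 2 coefficient array chosen for illustration (the function name split_form is hypothetical):

```python
from fractions import Fraction


def split_form(A):
    """Return (S, K): S[i][k] = (A[i][k] + A[k][i]) / 2, K[i][k] = (A[i][k] - A[k][i]) / 2."""
    n = len(A)
    S = [[Fraction(A[i][k] + A[k][i], 2) for k in range(n)] for i in range(n)]
    K = [[Fraction(A[i][k] - A[k][i], 2) for k in range(n)] for i in range(n)]
    return S, K


A = [[1, 2],
     [5, 4]]
S, K = split_form(A)

# S is symmetric, K is skew-symmetric, and together they give back A.
is_symmetric = all(S[i][k] == S[k][i] for i in range(2) for k in range(2))
is_skew = all(K[i][k] == -K[k][i] for i in range(2) for k in range(2))
restores = all(S[i][k] + K[i][k] == A[i][k] for i in range(2) for k in range(2))

print(S, K)
```

For n = 2 the two subspaces have dimensions 3 and 1, which add up to n^2 = 4, in agreement with the direct-sum statement.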

7. Now let us pass to a new basis:

e'_i = \sum_j P_{ij} e_j    (8)

In the new basis,

x = \sum_i x'_i e'_i,    y = \sum_k y'_k e'_k

Because the form a(x, y) is invariant, we have

a(x, y) = \sum_{i,k} a_{ik} x_i y_k = \sum_{i,k} a'_{ik} x'_i y'_k    (9)

where a'_{ik} are new coefficients. Quite naturally, the invariance of
the form a(x, y) does not signify the invariance of its coefficients
(generally, a'_{ik} \ne a_{ik}). Let us express the coefficients a'_{ik}
in terms of the old coefficients a_{ik}. We take advantage of the fact
that the values of the form on the basis vectors coincide with the
coefficients of the form

a'_{ik} = a(e'_i, e'_k)    (10)

In place of the new basis vectors e'_i, e'_k we substitute into (10)
their expressions (8):

a(e'_i, e'_k) = a(\sum_j P_{ij} e_j, \sum_l P_{kl} e_l)

Now, since the form a(x, y) is linear in each of the arguments,
we get

a(e'_i, e'_k) = \sum_{j,l} P_{ij} P_{kl} a(e_j, e_l)

Thus

a'_{ik} = \sum_{j,l=1}^{n} a_{jl} P_{ij} P_{kl}    (I)

Remark. The expressions (I) can be obtained in a different way,
by proceeding from (9). Since by Section 5, Chapter II,

x_j = \sum_i P_{ij} x'_i,    y_l = \sum_k P_{kl} y'_k

we have, from (9),

\sum_{j,l} a_{jl} x_j y_l = \sum_{i,j,k,l} a_{jl} P_{ij} P_{kl} x'_i y'_k = \sum_{i,k} a'_{ik} x'_i y'_k

whence we again find (I). Yet it is evident not only that (I)
follows from (9), but that (9) also follows from (I). Thus, the
invariance of the form implies the law of transformation of its
coefficients by equations (I). In turn, the transformation of coefficients
by (I) guarantees the invariance of the form.

§  3.  The  matrix  of  a  bilinear  form 

1. Given an arbitrary bilinear form. Expanded, it looks like this:

a(x, y) = \sum_{i,k} a_{ik} x_i y_k
        = a_{11} x_1 y_1 + a_{12} x_1 y_2 + \dots + a_{1n} x_1 y_n
        + a_{21} x_2 y_1 + a_{22} x_2 y_2 + \dots + a_{2n} x_2 y_n
        + \dots
        + a_{n1} x_n y_1 + a_{n2} x_n y_2 + \dots + a_{nn} x_n y_n

Writing out the coefficients in the form of an array, we get a
square matrix called the matrix of the bilinear form:

A = \begin{Vmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{Vmatrix}

In a given basis, the matrix fully determines the bilinear form
since it yields all the coefficients.

2. Suppose we pass to a new basis:

e'_i = \sum_j P_{ij} e_j

In the new basis, the form a(x, y) has a different matrix,
A' = \|a'_{ik}\|. The elements a'_{ik} of the matrix A' are expressed by
the formulas (I) of Section 2. Let us transform these formulas so
as to obtain a matrix relation that expresses (I) in its entirety.
Write (I) as

a'_{ik} = \sum_{j,l} P_{ij} a_{jl} P_{kl}    (1)

and introduce the quantities

c_{jk} = \sum_l a_{jl} P_{kl}    (\alpha)

By (1) and (\alpha) we have

a'_{ik} = \sum_j P_{ij} c_{jk}    (\beta)

Now form the matrix C = \|c_{jk}\| with the usual convention that
the first index stands for the row number and the second for the
column number.

Relations (\alpha) and (\beta) will now be considered from the
standpoint of the multiplication of matrices. When the index l is varied,
a_{jl} runs over a row of matrix A, and P_{kl} runs over a row of
matrix P. Thus, in (\alpha) we have the product of a row into a row.
To convert a row into a column it suffices to take the transpose
of the matrix. Accordingly, in relation (\alpha) we will regard the
second factor under the summation sign as an element of the matrix
P^* (the transpose of P). We then get the product of a row of A
by a column of P^*. In other words, (\alpha) is equivalent to the matrix
equation

C = A P^*    (\alpha_1)

Now consider formula (\beta). It is immediately apparent that on
the right we have a product of a row by a column, and so from (\beta)
it follows that

A' = P C    (\beta_1)

From (\alpha_1) and (\beta_1) we get the desired formula:

A' = P A P^*    (2)

Formula (2) expresses the matrix of the bilinear form in a new
basis in terms of the matrix P and of the matrix of this form in
the old basis; the change from the old to the new basis is made
with the aid of P.
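The matrix relation A' = P A P^* can be compared with a direct evaluation of a(e'_i, e'_k). The sketch below assumes n = 2 and arbitrarily chosen A and P; the rows of P are taken as the old components of the new basis vectors, as in e'_i = \sum_j P_{ij} e_j. The helper names are illustrative.

```python
def matmul(X, Y):
    """Ordinary row-by-column product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(X):
    n = len(X)
    return [[X[j][i] for j in range(n)] for i in range(n)]


def bilinear(A, x, y):
    """a(x, y) = sum over i, k of A[i][k] * x[i] * y[k]."""
    return sum(A[i][k] * x[i] * y[k] for i in range(len(x)) for k in range(len(y)))


A = [[1, 2],
     [3, 4]]
P = [[1, 2],
     [0, 1]]

# Matrix formula: A' = P A P*
A_new = matmul(matmul(P, A), transpose(P))

# Direct evaluation: a'_ik = a(e'_i, e'_k), where row i of P holds the old
# components of e'_i.
direct = [[bilinear(A, P[i], P[k]) for k in range(2)] for i in range(2)]

print(A_new, direct)
```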

3. Conclusions from formula (2). Note that P and P^* are
nonsingular matrices. From this fact and by the theorem on the rank
of a matrix product (Chapter II, Section 4) we have

rank A' = rank A    (3)

Definition. The rank of a bilinear form is the rank of its matrix.
Because of (3), the rank of a bilinear form is an invariant
relative to change of basis and is thus a quantity related to the form
itself, irrespective of its coordinate representation. Somewhat later
(in Section 11) we will give a geometric interpretation of the rank
of a bilinear form.

4. Let us consider the determinant of the matrix of a bilinear
form in some basis:

\Delta = \det A

In another basis, \Delta' = \det A'. From (2) and from the theorem on
multiplication of determinants it follows that

\Delta' = \Delta \cdot (\det P)^2    (4)

Thus, the determinant of the matrix of a bilinear form is not an
invariant but changes with a change of basis by formula (4).
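The step from (2) to (4) can be written out in one line, assuming the two standard facts that the determinant of a product is the product of the determinants and that a matrix and its transpose have equal determinants:

```latex
\Delta' = \det A' = \det (P A P^{*})
        = \det P \cdot \det A \cdot \det P^{*}
        = \Delta \cdot (\det P)^{2}
```

In particular, for real nonsingular P the factor (\det P)^2 is positive, so in real space the sign of \Delta does not change under a change of basis.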

5. Given a bilinear form a(x, y) = \sum_{i,k} a_{ik} x_i y_k whose
determinant \Delta \ne 0, and an arbitrary linear form
b(x) = \sum_i b_i x_i. We can then choose y so that a(x, y) = b(x) for
any x \in L (with y fixed). This can be done by finding
(y_1, \dots, y_n) from the system

\sum_k a_{ik} y_k = b_i,    i = 1, \dots, n

whose determinant \Delta \ne 0. Hence, one bilinear form a(x, y)
contains, as it were, all possible linear forms specified in L.
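For n = 2 the system for y can be solved explicitly. The following sketch uses Cramer's rule with exact fractions; the matrix A, the coefficients b and the function name solve_2x2 are assumptions made for the illustration.

```python
from fractions import Fraction


def solve_2x2(A, b):
    """Solve sum_k A[i][k] * y[k] = b[i] (i = 0, 1) by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("the determinant of the form must be nonzero")
    y1 = Fraction(b[0] * A[1][1] - A[0][1] * b[1], det)
    y2 = Fraction(A[0][0] * b[1] - b[0] * A[1][0], det)
    return [y1, y2]


def bilinear(A, x, y):
    return sum(A[i][k] * x[i] * y[k] for i in range(2) for k in range(2))


A = [[1, 2],
     [3, 4]]          # determinant -2, so the form is nondegenerate
b = [5, 6]            # coefficients of the linear form b(x) = 5 x_1 + 6 x_2
y = solve_2x2(A, b)

# With this fixed y, a(x, y) coincides with b(x) for every x; two samples:
samples = [[1, 0], [7, -3]]
agree = all(bilinear(A, x, y) == b[0] * x[0] + b[1] * x[1] for x in samples)

print(y, agree)
```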

§  4.  Quadratic  forms 

1. Given a symmetric bilinear form a(x, y): a(y, x) = a(x, y).
This is equivalent to its matrix being symmetric in any basis:
A^* = A. Indeed,

a_{ik} = a(e_i, e_k) = a(e_k, e_i) = a_{ki}

Identifying the two arguments of the form a(x, y), we get
a(x, x) = a(x, y) for y = x.

The  function  a(x,  x)  is  called  a  quadratic  form  and  corresponds 
to  the  given  symmetric  bilinear  form  a(x,  y). 

The  original  (symmetric)  bilinear  form  a(x,  y)  is  said  to  be  the 
polar  of  the  quadratic  form  a(x,  x). 

2. We will prove that the polar bilinear form is uniquely
determined by its quadratic form.

Suppose we have a numerical function f(x) of a vector argument,
and suppose f(x) is some quadratic form, i.e., f(x) = a(x, x),
and a(x, y) is unknown. To find it, we consider f(x + y), where x, y
are arbitrary vectors. Taking advantage of the properties of a
bilinear form and its symmetry, we have

f(x + y) = a(x + y, x + y) = a(x, x) + a(x, y) + a(y, x) + a(y, y)
         = f(x) + 2 a(x, y) + f(y)

whence the desired expression:

a(x, y) = \frac{1}{2} [f(x + y) - f(x) - f(y)]    (1)
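Formula (1) can be tested on a concrete quadratic form. The sketch below assumes a symmetric 2 x 2 matrix chosen for illustration and recovers the polar form from the values of f alone; the names f, polar and bilinear are hypothetical helpers.

```python
from fractions import Fraction

A = [[2, -1],
     [-1, 3]]         # a symmetric coefficient matrix


def bilinear(x, y):
    """The symmetric bilinear form a(x, y) with matrix A."""
    return sum(A[i][k] * x[i] * y[k] for i in range(2) for k in range(2))


def f(x):
    """The quadratic form f(x) = a(x, x)."""
    return bilinear(x, x)


def polar(x, y):
    """Formula (1): a(x, y) = (f(x + y) - f(x) - f(y)) / 2, using only f."""
    x_plus_y = [p + q for p, q in zip(x, y)]
    return Fraction(f(x_plus_y) - f(x) - f(y), 2)


x, y = [1, 2], [3, -1]
print(polar(x, y), bilinear(x, y))
```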

3. Formula (1) may be taken for the definition of a quadratic
form. Namely, we can say that a quadratically-homogeneous
function f(x) (that is, a function f such that f(\alpha x) = \alpha^2 f(x)
for any number \alpha) is called a quadratic form if the right-hand side
of (1) is a bilinear function.

Note  that  the  definition  of  a  quadratic  form  does  not  provide 
for  a  basis,  which  means  it  is  applicable  in  infinite-dimensional 
spaces. 

4. Example. Let L be a linear space of functions continuous on
the interval [0, 1].

Consider the function

f(x) = \int_0^1 [x(t)]^2 \, dt

the argument of which is x = x(t) \in L. Here it is clear that
f(\alpha x) = \alpha^2 f(x). Besides, we have

\frac{1}{2} [f(x + y) - f(x) - f(y)] = \int_0^1 x(t) y(t) \, dt    (2)

It is easy to verify that the right member of (2) is a bilinear
form. Thus, f(x) is a quadratic form in the infinite-dimensional
space L.

Later on (Section 10) we will see that important implications
follow; for instance, it will be possible to prove integral
inequalities with the aid of purely algebraic theorems.

5. Let us return to the n-dimensional case. In n-dimensional
space we consider a quadratic form and write its expression in
terms of the components of the arguments.

Let a(x, y) = a(y, x), x = y. Then

f(x) = a(x, x) = \sum_{i,k} a_{ik} x_i x_k
     = a_{11} x_1 x_1 + a_{12} x_1 x_2 + \dots + a_{1n} x_1 x_n
     + a_{21} x_2 x_1 + a_{22} x_2 x_2 + \dots + a_{2n} x_2 x_n    (3)
     + \dots
     + a_{n1} x_n x_1 + a_{n2} x_n x_2 + \dots + a_{nn} x_n x_n

If we take into account the symmetry of the coefficients, the
terms of the sum (3), with the exception of the diagonal terms,
naturally combine into pairs. We then obtain a frequently used
notation for quadratic forms which looks like this:

f(x) = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + 2 a_{13} x_1 x_3 + \dots + 2 a_{1n} x_1 x_n
     + a_{22} x_2^2 + 2 a_{23} x_2 x_3 + \dots + 2 a_{2n} x_2 x_n    (4)
     + \dots
     + a_{nn} x_n^2

Note that in the first row of (4) we have all the terms
containing x_1.

6. The matrix of a quadratic form is the matrix of its polar
bilinear form:

A = \begin{Vmatrix}
a_{11} & \dots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \dots & a_{nn}
\end{Vmatrix}

From this definition it follows immediately that the matrix of a
quadratic form transforms by the formula

A' = P A P^*

which was proved in the preceding section.

7. By definition, the rank of a quadratic form is equal to the
rank of its matrix: r = rank A.

8.  Quadratic  forms  have  important  geometric  applications  which 
will  be  considered  below  in  Chapters  VIII  and  XI.  For  the  present 
we  will  not  relate  any  geometric  objects  to  quadratic  forms  and 
will  examine  their  properties  from  the  algebraic  point  of  view. 

9. If in a certain basis it turns out that all coefficients a_{ik} = 0
for i \ne k, then we say that the quadratic form is canonical in that
basis:

f(x) = a_{11} x_1^2 + a_{22} x_2^2 + \dots + a_{nn} x_n^2

In order to obtain the canonical form of a quadratic form, the
basis must be chosen in a special way. In an arbitrary basis, a
quadratic form is complete, that is to say, it has all terms,
generally speaking.

The reduction of a quadratic form to canonical form is an
important problem of both theoretical and applied mathematics.




Below  we  give  two  methods  for  reducing  a  quadratic  form  to 
canonical  form:  the  Lagrange  method  and  the  Jacobi  method. 


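10. If a form has been reduced to the canonical form, then its
matrix becomes diagonal:

\begin{Vmatrix}
a_{11} &        &        & 0      \\
       & a_{22} &        &        \\
       &        & \ddots &        \\
0      &        &        & a_{nn}
\end{Vmatrix}    (5)

Since the rank of a quadratic form is an invariant, it is equal to
the number of nonzero diagonal elements of matrix (5).

If the rank = r < n, then after an appropriate change of the
numbers of the entries the matrix (5) may be written as

\begin{Vmatrix}
a_{11} &        &        &   &        &   \\
       & \ddots &        &   &        &   \\
       &        & a_{rr} &   &        &   \\
       &        &        & 0 &        &   \\
       &        &        &   & \ddots &   \\
       &        &        &   &        & 0
\end{Vmatrix}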

11. Remark. If a quadratic form is reduced to canonical form,
then its bilinear form is at the same time reduced to diagonal
form:

a(x, y) = a_{11} x_1 y_1 + a_{22} x_2 y_2 + \dots + a_{nn} x_n y_n

§  5.  Reducing  a  quadratic  form  to  canonical  form 
by  Lagrange’s  method 

1. Given a quadratic form f(x) = a(x, x). By formula (4),
Section 4, we can write f(x) in any basis as

f(x) = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + \dots + 2 a_{1n} x_1 x_n + g(x_2, \dots, x_n)    (1)

where g is a quadratic form that does not include x_1.
The notation (1) makes it possible to prove by induction that a
quadratic form can be reduced to canonical form.

Theorem. Every quadratic form can be reduced to canonical
form by means of a nonsingular linear transformation.

Remark. Here it is a question of transforming variables, namely
the numerical arguments x_1, \dots, x_n of the polynomial (1). But the
theorem can also be understood geometrically since any
nonsingular transformation of variables may be regarded as a
transformation of coordinates in a change of basis (see Chapter II).

2. Proof of the theorem. A quadratic form in one variable always
has the canonical form a_{11} x_1^2. For the hypothesis of induction we
assume that any quadratic form in (n - 1) numerical arguments
can be reduced to canonical form by a nonsingular linear
transformation of (n - 1) variables.

We consider an arbitrary quadratic form f(x) in n numerical
arguments:

f(x) = \sum_{i,j} a_{ij} x_i x_j

Using the induction hypothesis, we will prove that the quadratic
form can be reduced to canonical form by a nonsingular linear
transformation of n variables. Two cases are possible.

First case. In the quadratic form f(x) at least one of the
coefficients a_{ii} of the squares of the variables is different from zero.
Without any loss of generality we can assume that a_{11} \ne 0. We
set up the following linear transformation with respect to the given
coefficients of the form f(x):

y_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n,
y_2 = x_2,
\dots                                                  (2)
y_n = x_n

Denote by Q the matrix of this transformation:

Q = \begin{Vmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
0      & 1      & \dots & 0      \\
\vdots & \vdots & \ddots & \vdots \\
0      & 0      & \dots & 1
\end{Vmatrix}

The transformation (2) is nonsingular since \det Q = a_{11} \ne 0. Also
note that the nonsingularity of transformation (2) follows from its
one-to-oneness, which in turn is immediately apparent from
formulas (2).

Square the expression y_1 and divide by a_{11} \ne 0:

\frac{1}{a_{11}} y_1^2 = \frac{1}{a_{11}} (a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n)^2
  = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + \dots + 2 a_{1n} x_1 x_n + \varphi(x_2, \dots, x_n)

where \varphi is a quadratic form in the arguments x_2, \dots, x_n, that is
to say, \varphi does not include x_1. Now let us introduce another
quadratic form in the same arguments x_2, \dots, x_n, setting

\psi(x_2, \dots, x_n) = g(x_2, \dots, x_n) - \varphi(x_2, \dots, x_n)

where g(x_2, \dots, x_n) is given in the notation of f(x) in (1). Then
we get

f(x) = \frac{1}{a_{11}} y_1^2 + \psi(x_2, \dots, x_n)

or, what is the same thing,

f(x) = \frac{1}{a_{11}} y_1^2 + \psi(y_2, \dots, y_n)

By the induction hypothesis, there is a nonsingular
transformation of n - 1 variables

z_k = \sum_{l=2}^{n} R_{kl} y_l,    k = 2, \dots, n    (3)

which reduces the form \psi to canonical form:

\psi(y_2, \dots, y_n) = b_{22} z_2^2 + \dots + b_{nn} z_n^2

We complete the transformation (3) so that all n variables
participate. Namely, we put

z_1 = y_1,
z_2 = R_{22} y_2 + \dots + R_{2n} y_n,
\dots                                          (4)
z_n = R_{n2} y_2 + \dots + R_{nn} y_n

We transform the variables x_1, \dots, x_n into the variables
y_1, \dots, y_n by formulas (2), and then transform the variables
y_1, \dots, y_n by formulas (4) to get a transformation of the variables
x_1, \dots, x_n into the variables z_1, \dots, z_n, which reduces the
original quadratic form to the canonical form

f(x) = \frac{1}{a_{11}} z_1^2 + b_{22} z_2^2 + \dots + b_{nn} z_n^2

This transformation is nonsingular since it is the product of
nonsingular transformations (2) and (4).

Second case. All diagonal coefficients a_{ii} in the quadratic form
f(x) are zero. Then the foregoing does not apply. But one of the
nondiagonal coefficients is nonzero. Let it be a_{12}. Then the
quadratic form has the form

f(x) = 2 a_{12} x_1 x_2 + \dots    (5)

Make the transformation

x_1 = x'_1 + x'_2,
x_2 = x'_1 - x'_2,
x_3 = x'_3,              (6)
\dots
x_n = x'_n

The transformation is one-to-one and, hence, is nonsingular.

Substituting the quantities of (6) into the quadratic form (5),
we get

f(x) = 2 a_{12} x_1'^2 - 2 a_{12} x_2'^2 + \dots    (7)

The term 2 a_{12} x_1'^2 cannot vanish when collecting terms because all
terms of the quadratic form that are not written out in expression
(5) do not contain the product x_1 x_2 and cannot yield x_1'^2 via
transformation (6).

Furthermore, the quadratic form (7) can be reduced to
canonical form by a nonsingular transformation since we have the
first case here: the coefficient of x_1'^2 is nonzero. This finishes the
reasoning by induction and hence the proof.
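The induction above is constructive and can be turned into a procedure. The following is only a sketch in matrix form, not the text's own formulation: completing the square in one variable is carried out as a symmetric row-and-column elimination, so that the result is a nonsingular S with S A S^* diagonal (compare A' = P A P^* of Section 3). In the second case it uses the simpler substitution of adding one variable to another instead of transformation (6); both serve only to produce a nonzero diagonal coefficient. The names lagrange_reduce and congruence are hypothetical.

```python
from fractions import Fraction


def lagrange_reduce(A):
    """Return (D, S) with D = S * A * S^T diagonal, for a symmetric matrix A."""
    n = len(A)
    D = [[Fraction(v) for v in row] for row in A]
    S = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_row_col(target, source, factor):
        # Row step (also recorded in S), then the matching column step,
        # so that D stays equal to S * A * S^T throughout.
        for j in range(n):
            D[target][j] += factor * D[source][j]
            S[target][j] += factor * S[source][j]
        for i in range(n):
            D[i][target] += factor * D[i][source]

    for p in range(n):
        if D[p][p] == 0:
            # Second case: the square of variable p is absent.
            j = next((j for j in range(p + 1, n) if D[p][j] != 0), None)
            if j is None:
                continue  # variable p does not occur in the remaining form
            # One of the factors +1, -1 always gives a nonzero diagonal entry.
            factor = 1 if 2 * D[p][j] + D[j][j] != 0 else -1
            add_row_col(p, j, factor)
        # First case: remove all products of variable p with later variables.
        for i in range(p + 1, n):
            if D[i][p] != 0:
                add_row_col(i, p, -D[i][p] / D[p][p])
    return D, S


def congruence(S, A):
    """Compute S * A * S^T with plain nested lists."""
    n = len(A)
    SA = [[sum(S[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(SA[i][k] * S[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[0, 1],
     [1, 0]]                      # f(x) = 2 x_1 x_2: all diagonal coefficients are zero
D, S = lagrange_reduce(A)
print([D[i][i] for i in range(2)])
```

The canonical coefficients obtained this way depend on the substitutions chosen; only their number and signs are determined by the form, as the following sections explain.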

3. Remark. From the proof it is evident that a quadratic form
with real coefficients can be reduced to canonical form by a
nonsingular linear transformation which also has real coefficients.

§  6.  The  normal  form  of  a  quadratic  form 

1. Suppose a quadratic form $f(x)$ is reduced to the canonical form

$$ f(x) = \sum_{i=1}^{r} a_{ii} x_i^2 \tag{1} $$

where $a_{11}, \dots, a_{rr} \neq 0$ and $r$ is the rank of $f(x)$.

2. Suppose we are dealing with a complex space and allow for the use of linear transformations with complex coefficients. Set

$$
\begin{aligned}
y_i &= \sqrt{a_{ii}}\, x_i \quad \text{if } i \le r, \\
y_i &= x_i \quad \text{if } i > r
\end{aligned}
\tag{2}
$$

From (1) and (2) we get

$$ f(x) = y_1^2 + \dots + y_r^2 \tag{3} $$

assuming that $y_1, \dots, y_r, y_{r+1}, \dots, y_n$ are the new components (coordinates) of the vector $x$. The expression (3) is said to be the normal form of the quadratic form $f(x)$. Noticing that the transformation (2) is nonsingular, we draw the following conclusion.

In complex space, any quadratic form may be reduced to the normal form (3) by means of a nonsingular linear transformation.

3. We now confine ourselves to real spaces and real linear transformations. Taking into account that there may be negative coefficients among the coefficients $a_{ii}$, we set

$$
\begin{aligned}
y_i &= \sqrt{|a_{ii}|}\, x_i \quad \text{if } i \le r, \\
y_i &= x_i \quad \text{if } i > r
\end{aligned}
\tag{4}
$$

If the first $k$ coefficients $a_{ii}$ are positive and the remaining ones are negative, then from (1) and (4) we obtain

$$ f(x) = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{5} $$

Expression (5) is also called the normal form of the form $f(x)$. Thus, in real space, any quadratic form may be reduced to the normal form (5) with the aid of nonsingular real linear transformations.
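As a numerical illustration of the passage from canonical to normal form in the real case, the following sketch rescales each coordinate by $\sqrt{|a_{ii}|}$ and keeps only the signs. The coefficient values and the function name `to_normal_form` are assumptions made for the example.

```python
import math


def to_normal_form(coeffs):
    """For nonzero canonical coefficients a_ii return the scale factors
    sqrt(|a_ii|) and the signs +1 / -1 that remain in the normal form."""
    scales = [math.sqrt(abs(a)) for a in coeffs]
    signs = [1 if a > 0 else -1 for a in coeffs]
    return scales, signs


# Illustrative canonical form 4*x1^2 + 9*x2^2 - 16*x3^2 (rank r = 3).
coeffs = [4.0, 9.0, -16.0]
scales, signs = to_normal_form(coeffs)

x = [1.0, 2.0, 0.5]
y = [s * xi for s, xi in zip(scales, x)]          # y_i = sqrt(|a_ii|) * x_i

canonical_value = sum(a * xi ** 2 for a, xi in zip(coeffs, x))
normal_value = sum(sg * yi ** 2 for sg, yi in zip(signs, y))
print(canonical_value, normal_value)
```

Both values agree, since the rescaling only moves the absolute values of the coefficients into the new coordinates.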

4. In the next section we will prove that in real space the numbers of positive and negative terms in (5) do not depend on which particular (real) transformation is used to reduce the quadratic form to normal form.

§  7.  The  law  of  inertia  of  quadratic  forms 

1. Suppose in real space we have a quadratic form of rank $r$:

$$ f(x) = \sum_{i,k=1}^{n} a_{ik} x_i x_k $$

where $\{x_i\}$ are the components of the vector $x$ relative to a certain basis $e_1, \dots, e_n$.

Let $\tilde{e}_1, \dots, \tilde{e}_n$ be a basis in which $f(x)$ has the normal form:

$$ f(x) = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{1} $$

Here, $\{y_i\}$ are the components of the vector $x$ relative to the basis $\tilde{e}_1, \dots, \tilde{e}_n$.

2. The number of positive and the number of negative terms in formula (1) go respectively by the names positive and negative index of the form; the difference between the positive index and the negative index is called the signature.



3. Theorem (law of inertia of quadratic forms). The positive index and the negative index are invariants of a quadratic form, that is to say, they are independent of any choice of basis in which it has a normal form.

Proof. Let there be another basis $e'_1, \dots, e'_n$ relative to which the form $f(x)$ has the normal form

$$ f(x) = z_1^2 + \dots + z_m^2 - z_{m+1}^2 - \dots - z_r^2 \tag{2} $$

where $\{z_i\}$ are the components of $x$ relative to the basis $e'_1, \dots, e'_n$. It is required to prove that

$$ k = m $$

Suppose that $k \neq m$, for instance, that $k > m$. We consider the formulas for transformation of coordinates:

$$ z_i = \sum_{j=1}^{n} Q_{ij} y_j \tag{3} $$

Note that the matrix $Q$ of coefficients $Q_{ij}$ is nonsingular.

Substituting (3) into (2), we should get (1). Hence we have the identity

$$ z_1^2 + \dots + z_m^2 - z_{m+1}^2 - \dots - z_r^2 = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{4} $$

which is true for any $y_1, \dots, y_r, y_{r+1}, \dots, y_n$ assuming that $z_1, \dots, z_n$ are expressed in terms of $y_1, \dots, y_n$ with the aid of (3).

Let us set up the following auxiliary homogeneous system of equations:

$$
\begin{aligned}
Q_{11} y_1 + \dots + Q_{1k} y_k &= 0, \\
&\;\;\vdots \\
Q_{m1} y_1 + \dots + Q_{mk} y_k &= 0
\end{aligned}
\tag{5}
$$

The number of unknowns in (5) is greater than the number of equations because of the assumption that $k > m$. Therefore, the system (5) has a nontrivial solution $y_1, \dots, y_k$. Substitute this solution into the identity (4) with the condition that

$$ y_{k+1} = \dots = y_r = y_{r+1} = \dots = y_n = 0 \tag{6} $$

By (3), (5) and (6) we then have $z_1 = \dots = z_m = 0$, so that (4) becomes

$$ y_1^2 + \dots + y_k^2 = -z_{m+1}^2 - \dots - z_r^2 \tag{7} $$

But this is impossible since the left member of (7) is strictly positive whereas the right member is either negative or equal to zero. Hence, $k$ cannot exceed $m$. In similar fashion it is established that $m$ cannot exceed $k$. Therefore $k = m$, and the theorem is proved.
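The law of inertia can be observed numerically. The sketch below is an illustration under stated assumptions: the matrices `A` and `P` and the function names are made up for the example, and the diagonalization routine is one possible matrix version of reduction by completing squares, not necessarily the procedure the text has in mind. It computes the positive and negative indices of a form before and after a nonsingular change of variables.

```python
from fractions import Fraction


def diagonalize_by_congruence(A):
    """Diagonal of a matrix congruent to the symmetric matrix A, obtained
    by paired row/column operations (each pair is a change of variables)."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]

    def add_multiple(src, dst, c):
        # Row dst += c * row src, then column dst += c * column src.
        for j in range(n):
            M[dst][j] += c * M[src][j]
        for i in range(n):
            M[i][dst] += c * M[i][src]

    def swap(i, j):
        M[i], M[j] = M[j], M[i]
        for row in M:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if M[k][k] == 0:
            for j in range(k + 1, n):
                if M[j][j] != 0:          # bring a nonzero square to place k
                    swap(k, j)
                    break
            else:
                # All remaining diagonal coefficients are zero: mixing the
                # variables k and j creates the diagonal coefficient 2*a_kj.
                for j in range(k + 1, n):
                    if M[k][j] != 0:
                        add_multiple(j, k, Fraction(1))
                        break
        if M[k][k] == 0:
            continue                       # variable k does not occur at all
        for j in range(k + 1, n):
            if M[k][j] != 0:
                add_multiple(k, j, -M[k][j] / M[k][k])
    return [M[i][i] for i in range(n)]


def inertia(A):
    """Return (positive index, negative index) of the form with matrix A."""
    d = diagonalize_by_congruence(A)
    return sum(1 for v in d if v > 0), sum(1 for v in d if v < 0)


def change_basis(A, P):
    """Matrix P^T A P of the same form after the substitution x = P y."""
    n = len(A)
    return [[sum(P[s][i] * A[s][t] * P[t][j] for s in range(n) for t in range(n))
             for j in range(n)] for i in range(n)]


A = [[1, 2, 0], [2, 1, 0], [0, 0, -3]]     # x1^2 + 4*x1*x2 + x2^2 - 3*x3^2
P = [[1, 1, 0], [0, 1, 2], [1, 0, 1]]      # nonsingular: det P = 3
print(inertia(A), inertia(change_basis(A, P)))
```

Both calls report the same pair of indices, although the diagonal coefficients themselves differ between the two reductions.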




§  8.  Reducing  a  quadratic  form  to  canonical  form 
by  Jacobi’s  method 


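1. Given a quadratic form $f(x)$ which is written out in the components relative to some basis $e_1, \dots, e_n$:

$$ f(x) = a(x, x) = \sum_{i,k=1}^{n} a_{ik} x_i x_k $$

Recall that

$$ a_{ik} = a(e_i, e_k) $$

Form the matrix of the quadratic form $f(x)$:

$$
A = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\
\vdots & \vdots & \vdots &       & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn}
\end{pmatrix}
$$

Now consider the so-called principal minors of $A$:

$$
\Delta_1 = a_{11}, \quad
\Delta_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad
\Delta_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \dots, \quad
\Delta_n = \det A
\tag{1}
$$

Also, for convenience, we introduce the quantity $\Delta_0$, assuming $\Delta_0 = 1$.

The Jacobi method is based on the assumption that all principal minors of the matrix $A$ are nonzero:

$$ \Delta_1 \neq 0, \quad \Delta_2 \neq 0, \quad \dots, \quad \Delta_n \neq 0 \tag{2} $$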

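We then seek a special new basis such that

$$
\begin{aligned}
e'_1 &= p_{11} e_1, \\
e'_2 &= p_{21} e_1 + p_{22} e_2, \\
&\;\;\vdots \\
e'_k &= p_{k1} e_1 + p_{k2} e_2 + \dots + p_{kk} e_k, \\
&\;\;\vdots \\
e'_n &= p_{n1} e_1 + p_{n2} e_2 + \dots + p_{nn} e_n
\end{aligned}
\tag{3}
$$

In order to reduce the quadratic form $f(x)$ to canonical form, it suffices, for any $k$ ($1 < k \le n$), to ensure the conditions

$$ a(e'_i, e'_k) = a'_{ik} = 0 \quad \text{for } i = 1, 2, \dots, k - 1 \tag{4} $$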



Then the $a'_{ki}$ will also be equal to zero (because the matrix of the quadratic form is symmetric), and only the coefficients of the squares of numerical arguments will turn out to be different from zero.

2. Observe that to fulfil conditions (4) it suffices to require that the following equations hold true:

$$ a(e_i, e'_k) = 0, \quad i = 1, 2, \dots, k - 1; \quad k = 1, 2, \dots, n \tag{5} $$

Indeed, from (3) and (5) we have

$$ a(e'_i, e'_k) = a(p_{i1} e_1 + \dots + p_{ii} e_i,\ e'_k) = p_{i1}\, a(e_1, e'_k) + \dots + p_{ii}\, a(e_i, e'_k) = 0 $$

To simplify subsequent derivations add to (5) the supplementary equation

$$ a(e_k, e'_k) = 1 \tag{6} $$

3. When $k = 1$, the conditions (5) vanish and only (6) remains, from which, taking into account the first row of (3), we get

$$ 1 = a(e_1, e'_1) = p_{11}\, a(e_1, e_1) = p_{11} a_{11} $$

whence

$$ p_{11} = \frac{1}{a_{11}} $$

since $a_{11} \neq 0$.

Taking into account the notation of (1), we can write

$$ p_{11} = \frac{\Delta_0}{\Delta_1} $$

4. From now on we argue by induction. Assume that all coefficients appearing in the first $k - 1$ rows of formulas (3) have been determined. To find the coefficients appearing in the $k$th row, we write the conditions (5) and (6) together:

$$ a(e_1, e'_k) = 0, \quad \dots, \quad a(e_{k-1}, e'_k) = 0, \quad a(e_k, e'_k) = 1 \tag{7} $$

From this, using (3) we get the following system of equations for the desired coefficients:

$$
\begin{aligned}
a_{11} p_{k1} + a_{12} p_{k2} + \dots + a_{1k} p_{kk} &= 0, \\
&\;\;\vdots \\
a_{k-1,1} p_{k1} + a_{k-1,2} p_{k2} + \dots + a_{k-1,k} p_{kk} &= 0, \\
a_{k1} p_{k1} + a_{k2} p_{k2} + \dots + a_{kk} p_{kk} &= 1
\end{aligned}
\tag{7a}
$$

The determinant of system (7a) coincides with $\Delta_k$ and is nonzero because of assumption (2). The desired coefficients $p_{k1}, \dots, p_{kk}$ will therefore be found. It remains to verify that the constructed transformation is nonsingular. With this in mind we find the coefficient $p_{kk}$ from system (7a). Applying Cramer's rule, we obtain

$$
p_{kk} = \frac{1}{\Delta_k}
\begin{vmatrix}
a_{11} & \dots & a_{1,k-1} & 0 \\
\vdots &       & \vdots    & \vdots \\
a_{k-1,1} & \dots & a_{k-1,k-1} & 0 \\
a_{k1} & \dots & a_{k,k-1} & 1
\end{vmatrix}
= \frac{\Delta_{k-1}}{\Delta_k}
\tag{8}
$$

Then, using the triangular structure of the matrix of transformation (3), we find the determinant $D$ of that matrix:

$$ D = p_{11} p_{22} \cdots p_{nn} = \frac{\Delta_0}{\Delta_1} \cdot \frac{\Delta_1}{\Delta_2} \cdots \frac{\Delta_{n-1}}{\Delta_n} = \frac{1}{\Delta_n} $$

Thus, $D \neq 0$ and, hence, the transformation (3) is nonsingular.

5. Now we can determine the coefficients of the quadratic form in the new basis $e'_1, \dots, e'_n$. All we need to do is compute the diagonal coefficients since all the others are zero anyway. Utilizing (3), (7) and (8), we find

$$ a'_{kk} = a(e'_k, e'_k) = a(p_{k1} e_1 + \dots + p_{kk} e_k,\ e'_k) = p_{kk}\, a(e_k, e'_k) = p_{kk} = \frac{\Delta_{k-1}}{\Delta_k} $$

Hence, relative to the basis constructed by the Jacobi method,

$$ f(x) = \frac{\Delta_0}{\Delta_1} (x'_1)^2 + \frac{\Delta_1}{\Delta_2} (x'_2)^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n} (x'_n)^2 $$
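The construction can be traced on a concrete matrix. The following sketch is an illustration only: the matrix `A` and all function names are assumptions, and the triangular system for the $k$th row of the new basis is solved by Cramer's rule with exact fractions. It then checks that the form becomes diagonal with coefficients equal to ratios of consecutive principal minors.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination with exact fractions
    (the determinant of the empty matrix is taken to be 1)."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d


def jacobi_basis(A):
    """Lower triangular matrix P whose k-th row holds p_k1, ..., p_kk."""
    n = len(A)
    P = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        Ak = [row[:k] for row in A[:k]]
        delta_k = det(Ak)
        if delta_k == 0:
            raise ValueError("a principal minor is zero; the method does not apply")
        rhs = [0] * (k - 1) + [1]           # right-hand sides 0, ..., 0, 1
        for j in range(k):
            Aj = [row[:j] + [rhs[i]] + row[j + 1:] for i, row in enumerate(Ak)]
            P[k - 1][j] = det(Aj) / delta_k  # Cramer's rule
    return P


def transformed_matrix(A, P):
    """Coefficients a'_ik = a(e'_i, e'_k) in the new basis."""
    n = len(A)
    return [[sum(P[i][s] * A[s][t] * P[k][t] for s in range(n) for t in range(n))
             for k in range(n)] for i in range(n)]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]       # illustrative symmetric matrix
minors = [det([row[:k] for row in A[:k]]) for k in range(len(A) + 1)]
P = jacobi_basis(A)
A_new = transformed_matrix(A, P)
print([str(m) for m in minors])
print([[str(v) for v in row] for row in A_new])
```

For this matrix the principal minors are 1, 2, 5, 18 (counting the conventional minor of order zero), and the diagonal of `A_new` consists of the ratios 1/2, 2/5, 5/18.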


§  9.  Positive  definite  and  negative  definite  quadratic  forms 

1. In this section we consider only real spaces.

Given in a linear space, possibly infinite-dimensional, a quadratic form $f(x)$.

Definition 1. The quadratic form $f(x)$ is said to be positive definite if $f(x) > 0$ for all $x \neq \theta$.

Note that $f(\theta) = 0$ always. Indeed, since $\theta = 0 \cdot z$ and $f(x) = a(x, x)$, where $z$ is an arbitrary vector and $a(x, y)$ is a bilinear function, it follows that

$$ f(\theta) = a(0 \cdot z,\ 0 \cdot z) = 0 \cdot a(z, z) = 0 $$

Definition 2. The quadratic form $f(x)$ is said to be negative definite if $f(x) < 0$ for all $x \neq \theta$.

It thus suffices to consider positive definite forms since negative definite forms are obtained from the former by a change of sign.



2. Confining ourselves to quadratic forms in finite-dimensional ($n$-dimensional) spaces, we first of all note a series of simple necessary features of positive definiteness. Suppose, relative to some basis $e_1, \dots, e_n$, we have a quadratic form

$$ f(x) = a(x, x) = \sum_{i,k=1}^{n} a_{ik} x_i x_k $$

Recall that $a_{ik} = a(e_i, e_k)$.

(1) If $f(x)$ is positive definite, then $a_{ii} > 0$ for all $i = 1, 2, \dots, n$.

Proof.

$$ a_{ii} = a(e_i, e_i) = f(e_i) > 0 $$

Remark. This condition is not at all sufficient for the form to be positive definite. Here is an example. The form

$$ f(x) = x_1^2 + 1000\, x_1 x_2 + x_2^2 $$

has $a_{ii} = 1 > 0$, but on the vector $(-1, 1)$ it assumes a negative value.

(2) If the form $f(x)$ is positive definite, then the determinant of its matrix is positive:

$$ \Delta = \det A > 0 $$

To prove this, we reduce $f(x)$ to canonical form. Let $e'_1, \dots, e'_n$ be a canonical basis, that is, a basis in which $f(x)$ is of canonical form:

$$ f(x) = a'_{11} (x'_1)^2 + \dots + a'_{nn} (x'_n)^2 $$

According to the preceding characteristic, all $a'_{ii} > 0$.

Denote by $\Delta'$ the determinant of the matrix of the form $f(x)$ in the canonical basis. We have

$$
\Delta' = \begin{vmatrix}
a'_{11} &        & 0 \\
        & \ddots &   \\
0       &        & a'_{nn}
\end{vmatrix}
= a'_{11} \cdots a'_{nn} > 0
$$

On the other hand, by formula (4) of Section 3,

$$ \Delta' = \Delta\, (\det P)^2 $$

Hence $\Delta > 0$.

Remark. Neither is this condition sufficient for the quadratic form to be positive definite. An example: the form

$$ f(x) = -x_1^2 - x_2^2 $$

has $\Delta > 0$, but $f(x) \le 0$.




(3) In $n$-dimensional space every positive definite form has rank $n$. The proof follows from the inequality $\Delta \neq 0$.


3. Theorem (Sylvester's criterion). For a quadratic form to be positive definite, it is necessary and sufficient that all the principal minors of its matrix be positive.

Necessity. Let the form $f(x)$ be positive definite. Take an arbitrary basis $e_1, \dots, e_k, \dots, e_n$ and construct the linear hull $L(e_1, \dots, e_k)$. Now consider the quadratic form $f(x)$ not on the whole space but only on the subspace $L(e_1, \dots, e_k)$.

If $x \in L(e_1, \dots, e_k)$, then $x = \{x_1, \dots, x_k, 0, \dots, 0\}$ and

$$ f(x) = \sum_{i,j=1}^{k} a_{ij} x_i x_j $$

All the remaining terms, whose coefficients have one of the two indices greater than $k$, vanish because of the zero values of the coordinates.

On the subspace $L(e_1, \dots, e_k)$ the form $f(x)$ is positive definite since it is positive definite on the whole space. Therefore the determinant of the form $f(x)$ considered on $L(e_1, \dots, e_k)$ is positive:

$$
\Delta_k = \begin{vmatrix}
a_{11} & \dots & a_{1k} \\
\vdots &       & \vdots \\
a_{k1} & \dots & a_{kk}
\end{vmatrix} > 0
$$

But $\Delta_k$ is a principal minor of order $k$ of the matrix of the quadratic form $f(x)$, and the index $k$ can assume the values $1, 2, \dots, n$. The necessity proof is complete.

Sufficiency. Let $\Delta_k > 0$ for $k = 1, \dots, n$.

Reduce the quadratic form to canonical form by the Jacobi method. This yields

$$ f(x) = \frac{\Delta_0}{\Delta_1} (x'_1)^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n} (x'_n)^2 $$

where all the coefficients are positive. If $x \neq \theta$, then at least one of the coordinates $x'_k \neq 0$, and, hence, $f(x) > 0$, which completes the proof of the theorem.
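Sylvester's criterion translates directly into a short test. The sketch below is a minimal illustration with exact arithmetic; the function names and the sample matrices are assumptions chosen for the example, and the second matrix is the form $x_1^2 + 1000x_1x_2 + x_2^2$, whose diagonal coefficients are positive although the form is not positive definite.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination with exact fractions."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d


def principal_minors(A):
    """Minors of orders 1, ..., n taken from the upper left corner of A."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_definite(A):
    """Sylvester's criterion for a symmetric matrix A."""
    return all(m > 0 for m in principal_minors(A))


print(is_positive_definite([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
print(is_positive_definite([[1, 500], [500, 1]]))
print(is_positive_definite([[-1, 0], [0, -1]]))
```

The last matrix has a positive determinant, yet its first minor is negative, so the test rejects it.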


4. Take note of the two-dimensional case. Let

$$ f = ax^2 + 2bxy + cy^2 $$

where this time the numerical arguments of the form are denoted by $x$, $y$.

Sylvester's condition reduces to the inequalities

$$ a > 0, \qquad \begin{vmatrix} a & b \\ b & c \end{vmatrix} = ac - b^2 > 0 $$

Quite naturally, in the two-dimensional case, Sylvester's criterion can be established without any special theory since for positive definiteness it is necessary that $a > 0$, and for $a > 0$ we have

$$ f = \frac{1}{a} \left[ (ax + by)^2 + (ac - b^2)\, y^2 \right] $$


§  10.  Gram’s  determinant.  The  Cauchy-Bunyakovsky  inequality 


1. Suppose that in an arbitrary linear space $L$ (possibly infinite-dimensional) there is given a quadratic form $f(x) = a(x, x)$ and a finite system of vectors $p_1, \dots, p_k$.

Definition. The Gram determinant for the quadratic form $a(x, x)$ and the system of vectors $p_1, \dots, p_k$ is defined to be the quantity

$$
G(p_1, \dots, p_k) = \begin{vmatrix}
a(p_1, p_1) & \dots & a(p_1, p_k) \\
\vdots      &       & \vdots \\
a(p_k, p_1) & \dots & a(p_k, p_k)
\end{vmatrix}
$$

Determinants of this kind are frequently encountered in mathematical physics and the theory of integral equations.


2. Theorem. Let the space $L$ be real and the quadratic form $a(x, x)$ positive definite. Then $G(p_1, \dots, p_k) > 0$ if the vectors $p_1, \dots, p_k$ are linearly independent. If the vectors $p_1, \dots, p_k$ are dependent, then $G(p_1, \dots, p_k) = 0$.

Proof. (1) Let the vectors $p_1, \dots, p_k$ be linearly independent. Then they will constitute a basis in their linear hull $L(p_1, \dots, p_k)$. An arbitrary vector $x \in L(p_1, \dots, p_k)$ may be written as

$$ x = x_1 p_1 + \dots + x_k p_k $$

We will consider $f(x)$ on vectors of $L(p_1, \dots, p_k)$. With respect to the basis $p_1, \dots, p_k$ we have

$$ f(x) = \sum_{i,j=1}^{k} a_{ij} x_i x_j $$

(even if the original space $L$ is infinite-dimensional).

Since $f(x)$ is positive definite on the whole space $L$, it is also positive definite on the subspace $L(p_1, \dots, p_k)$ so that

$$
\Delta_k = \begin{vmatrix}
a_{11} & \dots & a_{1k} \\
\vdots &       & \vdots \\
a_{k1} & \dots & a_{kk}
\end{vmatrix} > 0
\tag{1}
$$

Note that $a_{ij} = a(p_i, p_j)$, whence and also from (1)

$$ G(p_1, \dots, p_k) = \Delta_k > 0 $$


(2) Now let $p_1, \dots, p_k$ be linearly dependent. Then there will be scalars $\lambda_1, \dots, \lambda_k$, not all zero, for which

$$ \lambda_1 p_1 + \dots + \lambda_k p_k = \theta $$

Note that

$$ a(x, \theta) = 0 $$

and substitute

$$ x = p_i, \qquad \theta = \lambda_1 p_1 + \dots + \lambda_k p_k $$

into this identity. Assigning the values $1, \dots, k$ to $i$, we get a homogeneous system of $k$ linear equations in $k$ unknowns:

$$
\begin{aligned}
\lambda_1 a(p_1, p_1) + \dots + \lambda_k a(p_1, p_k) &= 0, \\
&\;\;\vdots \\
\lambda_1 a(p_k, p_1) + \dots + \lambda_k a(p_k, p_k) &= 0
\end{aligned}
$$

This system definitely has a nontrivial solution $\lambda_1, \dots, \lambda_k$ and therefore its determinant is zero:

$$ G(p_1, \dots, p_k) = 0 $$

The proof of the theorem is complete.
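Both halves of the theorem can be seen on a small example. In the sketch below the positive definite form is taken to be the ordinary sum of products of coordinates in three-dimensional space; that choice, the sample vectors and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination with exact fractions."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d


def a(x, y):
    """Assumed bilinear form: a(x, y) = x1*y1 + x2*y2 + x3*y3."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram(vectors):
    """Gram determinant of a system of vectors with respect to a(x, y)."""
    return det([[a(p, q) for q in vectors] for p in vectors])


p1, p2 = (1, 0, 1), (0, 1, 1)
p3 = tuple(u + v for u, v in zip(p1, p2))   # p3 = p1 + p2, a dependent vector

print(gram([p1, p2]))       # independent system: positive
print(gram([p1, p2, p3]))   # dependent system: zero
```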

3. An important special case. Within the framework of this theorem, let us consider a system consisting of two vectors $p_1$, $p_2$. We have

$$
G(p_1, p_2) = \begin{vmatrix}
a(p_1, p_1) & a(p_1, p_2) \\
a(p_2, p_1) & a(p_2, p_2)
\end{vmatrix} \ge 0
$$

Expanding this determinant and taking into account the symmetric nature of the bilinear form, we obtain the inequality

$$ [a(p_1, p_2)]^2 \le a(p_1, p_1) \cdot a(p_2, p_2) \tag{2} $$

which is called the Cauchy-Bunyakovsky inequality. Equality occurs if and only if the vectors $p_1$ and $p_2$ are linearly dependent.

4. Let us consider the space of continuous functions specified on some interval $t_1 \le t \le t_2$. In this space we consider the quadratic form

$$ f(x) = \int_{t_1}^{t_2} [x(t)]^2 \, dt $$

(in this connection, see Section 4, Subsection 4).

The polar bilinear form of $f(x)$ is

$$ a(x, y) = \int_{t_1}^{t_2} x(t)\, y(t) \, dt $$

It is readily seen that $f(x)$ is positive definite. Indeed, if the continuous function $x(t)$ is not identically zero, then

$$ \int_{t_1}^{t_2} [x(t)]^2 \, dt > 0 $$

Therefore, in this case inequality (2) can be used. We thus get the Cauchy-Bunyakovsky inequality for integrals:

$$ \left[ \int_{t_1}^{t_2} x(t)\, y(t) \, dt \right]^2 \le \int_{t_1}^{t_2} [x(t)]^2 \, dt \cdot \int_{t_1}^{t_2} [y(t)]^2 \, dt \tag{3} $$

Equality in (3) occurs if and only if the system $x(t)$, $y(t)$ is linearly dependent or, to put it more simply, if one of the functions $x(t)$, $y(t)$ is proportional to the other (say, $y(t) = Cx(t)$, $C$ constant).

This example shows that algebraic theorems operate outside the domain of algebra proper and make it possible to obtain results from analysis. The general basis of such applications is the construction, in infinite-dimensional function space, of finite-dimensional linear hulls.
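The integral inequality can be verified exactly for polynomials. The sketch below is an illustration under assumptions: functions are represented by lists of polynomial coefficients, the interval is taken to be $0 \le t \le 1$, and the names `poly_mul`, `integrate` and `a` are introduced only for this example.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += Fraction(u) * Fraction(v)
    return out


def integrate(p, t1, t2):
    """Exact integral of the polynomial p over the interval [t1, t2]."""
    return sum(Fraction(c) * (Fraction(t2) ** (k + 1) - Fraction(t1) ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))


def a(x, y, t1=0, t2=1):
    """Bilinear form a(x, y): the integral of x(t) * y(t) over [t1, t2]."""
    return integrate(poly_mul(x, y), t1, t2)


x = [0, 1]        # x(t) = t
y = [0, 0, 1]     # y(t) = t^2, not proportional to x(t)
lhs = a(x, y) ** 2
rhs = a(x, x) * a(y, y)
print(lhs, rhs)   # strict inequality lhs < rhs

y_prop = [0, 3]   # y(t) = 3t, proportional to x(t)
print(a(x, y_prop) ** 2, a(x, x) * a(y_prop, y_prop))   # equality
```

For $x(t) = t$ and $y(t) = t^2$ the two sides are $1/16$ and $1/15$; for the proportional pair they coincide.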


§11.  Zero  subspaces  of  a  bilinear  and  a  quadratic  form 

1. Let $a(x, y)$ be a bilinear form given in a space $L$.

Definition 1. We will use the term right zero subspace of the form $a(x, y)$ for the set of all elements $y$ for each of which the equation

$$ a(x, y) = 0 \tag{$\alpha$} $$

holds true for every $x \in L$. This definition clearly does not depend on the dimension and can be used in the infinite-dimensional case.

We denote the right zero subspace by $L_0''$.

In similar fashion we define the left zero subspace $L_0'$, namely: $y \in L_0'$ if $a(y, x) = 0$ for any $x \in L$.

2. First of all we will prove that $L_0''$ is indeed a linear subspace. Let $y_1, y_2 \in L_0''$. Then $a(x, y_1) = 0$, $a(x, y_2) = 0$ for any $x$; but then it follows that

$$
\begin{aligned}
a(x, y_1 + y_2) &= a(x, y_1) + a(x, y_2) = 0, \\
a(x, \alpha y_1) &= \alpha\, a(x, y_1) = 0
\end{aligned}
$$

Thus, $y_1 + y_2 \in L_0''$ and $\alpha y_1 \in L_0''$.

In quite analogous fashion we prove that $L_0'$ too is a subspace.


3. We now consider an $n$-dimensional space, in which we fix a basis $e_1, \dots, e_n$ and write down the bilinear form in coordinates:

$$ a(x, y) = \sum_{i,j=1}^{n} a_{ij} x_i y_j $$

where $a_{ij} = a(e_i, e_j)$.

We will show that $y \in L_0''$ if and only if

$$ \sum_{j=1}^{n} a_{ij} y_j = 0 \tag{1} $$

for all values of $i$ ($i = 1, 2, \dots, n$). To do this, we write down the identity

$$ \sum_{i,j=1}^{n} a_{ij} x_i y_j = \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{n} a_{ij} y_j \Bigr) x_i \tag{$\beta$} $$

If the conditions (1) hold, then so also does ($\alpha$). On the other hand, from the condition ($\alpha$) it follows that all the coefficients of $x_i$ in the right member of ($\beta$) vanish, which is what gives us the system (1).

In exactly the same way, $y \in L_0'$ if and only if

$$ \sum_{i=1}^{n} a_{ij} y_i = 0 \tag{2} $$

for all values of $j$ ($j = 1, 2, \dots, n$).

Equations (1) and (2) are systems of equations defining $L_0''$ and $L_0'$ in terms of coordinates.

By the theorems on linear systems (see Chapter III), (dimension of $L_0''$) = (dimension of $L_0'$) = $n - r$, where $r$ is the rank of the bilinear form $a(x, y)$, that is, the rank of its matrix.

From this it follows that the rank of a bilinear form may be determined geometrically, namely: the rank of the form $a(x, y)$ is equal to the difference between the dimension of the entire space and the dimension of the zero subspace of this form (it is immaterial which zero subspace, right or left, is taken since their dimensions are the same).

4. Definition 2. A bilinear form is said to be nonsingular if the dimension of $L_0'$ (or $L_0''$) is equal to zero. In all other cases the bilinear form is singular.

In other words, a bilinear form is singular if its zero subspaces have a nonzero dimensionality or, what is the same thing, if its rank is less than the dimension of the space: $r < n$, or if the determinant of its matrix is zero: $\Delta = \det A = 0$.

5. Suppose the bilinear form $a(x, y)$ is singular, that is, its rank $r < n$. We introduce into this space a special basis $e_1, e_2, \dots, e_n$ such that $e_{r+1}, \dots, e_n \in L_0''$. To do this, it is necessary first to choose $n - r$ linearly independent vectors in $L_0''$ (their number is precisely equal to the dimension of $L_0''$) and then complete the basis for the entire space. For such a choice of basis, let us see what the matrix of the bilinear form will look like. If $j = r + 1, \dots, n$, then by the definition of $L_0''$,

$$ a_{ij} = a(e_i, e_j) = 0 $$

Thus

$$
A = \begin{pmatrix}
a_{11} & \dots & a_{1r} & 0 & \dots & 0 \\
a_{21} & \dots & a_{2r} & 0 & \dots & 0 \\
\vdots &       & \vdots & \vdots &  & \vdots \\
a_{n1} & \dots & a_{nr} & 0 & \dots & 0
\end{pmatrix}
$$

The matrix is simplified, and the more so, the lower the rank of the bilinear form (the higher the dimensionality of the zero subspace).

If the basis is chosen so that the last $n - r$ basis vectors are in the left zero subspace, then this too will lead to a simplification of the matrix, but this time it is not the columns that vanish, but the last $n - r$ rows:

$$
A = \begin{pmatrix}
a_{11} & \dots & a_{1n} \\
\vdots &       & \vdots \\
a_{r1} & \dots & a_{rn} \\
0      & \dots & 0 \\
\vdots &       & \vdots \\
0      & \dots & 0
\end{pmatrix}
$$

Let the bilinear form be symmetric. Then $L_0'$ coincides with $L_0''$ (prove this). Now place the basis vectors $e_{r+1}, \dots, e_n$ in $L_0''$. These same vectors will be in $L_0'$ and the matrix will become particularly simple in aspect:

$$
A = \begin{pmatrix}
a_{11} & \dots & a_{1r} & 0 & \dots & 0 \\
\vdots &       & \vdots & \vdots &  & \vdots \\
a_{r1} & \dots & a_{rr} & 0 & \dots & 0 \\
0      & \dots & 0      & 0 & \dots & 0 \\
\vdots &       & \vdots & \vdots &  & \vdots \\
0      & \dots & 0      & 0 & \dots & 0
\end{pmatrix}
\tag{3}
$$

Thus, if the rank $r$ of a symmetric bilinear form is less than $n$, then its consideration fully reduces to a subspace of dimension $r$ (spanned by $e_1, \dots, e_r$).
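In coordinates, the right and left zero subspaces are the solution spaces of the homogeneous systems with matrix $A$ and with its transpose. The following sketch computes both for a nonsymmetric example; the matrix and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def null_space(M):
    """Basis of the solution space of M y = 0 (Gauss-Jordan elimination)."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_cols = len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue                       # no pivot here: a free unknown
        rows[r], rows[piv] = rows[piv], rows[r]
        lead = rows[r][c]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [u - f * v for u, v in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for i, c in enumerate(pivots):
            v[c] = -rows[i][free]
        basis.append(v)
    return basis


def transpose(M):
    return [list(col) for col in zip(*M)]


# Illustrative nonsymmetric bilinear form of rank 2 in three-dimensional space.
A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]

right = null_space(A)              # y with sum_j a_ij y_j = 0 for all i
left = null_space(transpose(A))    # y with sum_i a_ij y_i = 0 for all j
rank = len(A) - len(right)
print(rank, [[str(v) for v in y] for y in right], [[str(v) for v in y] for y in left])
```

The two subspaces are different lines here, but both have dimension $n - r = 1$, in agreement with the statement about dimensions.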


6. Now let us consider the quadratic form $f(x) = a(x, x)$.

Definition 3. The zero subspace of a quadratic form is the zero subspace $L_0$ of its polar form $a(x, y)$.

There is no need to distinguish between $L_0'$ and $L_0''$ since $L_0' = L_0''$.

If the quadratic form is nonsingular, then its zero subspace is zero-dimensional.

If the form is singular, then its rank $r < n$ and $L_0$ has dimension $n - r$. Relative to the basis $e_1, \dots, e_n$, the vectors $e_{r+1}, \dots, e_n$ of which are placed in $L_0$, the matrix of the quadratic form is of the form (3), and the form $f(x)$ may be considered in the $r$-dimensional subspace $L(e_1, \dots, e_r)$.

§  12.  The  zero  cone  of  a  quadratic  form 

1. Besides the given linear space $L$ we consider the affine space $\mathfrak{A}$, assuming that the elements of $L$ are the radius vectors of points in $\mathfrak{A}$.

2. Definition. A set of points in affine space is called a cone with vertex $O$ if together with every point $M$ that is noncoincident with the vertex $O$ the set contains the entire straight line $OM$ (for $n = 3$ see Fig. 24).

In certain cases, the single point $O$ is conveniently regarded as a cone consisting solely of the vertex. The simplest cases of cones are: any plane passing through the point $O$, and also the entire space $\mathfrak{A}$.

3. Let there be given in the space $L$ a quadratic form $f(x)$. It may be regarded also in the affine space $\mathfrak{A}$, assuming that the value of $f(x)$ at the point $M$ is defined for $x = \overrightarrow{OM}$, that is, is equal to $f(\overrightarrow{OM})$.

Denote by $K_0$ the set of points of the affine space at which the quadratic form $f(x)$ is zero ($M \in K_0$ if $f(\overrightarrow{OM}) = 0$).

Theorem 1. The set $K_0$ is a cone with vertex at the coordinate origin.




Proof. It may happen that $K_0$ consists of the single point $O$ (for example, if the space is real and the form $f(x)$ is positive definite). Then the assertion is true, for we agreed to consider a single point a cone.

Suppose there is a point $M \neq O$ for which

$$ f(x) = 0 $$

when $x = \overrightarrow{OM}$.

Draw through $O$ and $M$ a straight line and on it take any point $M^*$ (Fig. 25). Set $\overrightarrow{OM^*} = x^*$. Then $x^* = \lambda x$, where $\lambda$ is a scalar. Consequently,

$$ f(x^*) = f(\lambda x) = a(\lambda x, \lambda x) = \lambda^2 a(x, x) = \lambda^2 f(x) = 0 $$

Thus, with every point $M \neq O$ the set $K_0$ also contains all points of the straight line $OM$ (Fig. 25).

[Fig. 25]


4. Definition. The set $K_0$ is called the zero cone of the quadratic form $f(x)$.

5. Note that, generally speaking, the cone is not a linear subspace: if $f(x) = 0$, $f(y) = 0$, then it may be that $f(x + y) \neq 0$.

6. Theorem 2. The zero subspace of a quadratic form is always part of the zero cone of that form:

$$ L_0 \subset K_0 $$

Remark. $L_0$ is defined as a set of vectors in linear space, while $K_0$ is defined as a set of points in affine space.

Therefore, when speaking about the inclusion $L_0 \subset K_0$, we have to assume that $L_0$ is the point set of the endpoints of the radius vectors of the zero subspace.



Proof. Let $y \in L_0$, $y = \overrightarrow{OM}$. Then $a(x, y) = 0$ for any $x$. Set $x = y$ to get $a(y, y) = f(y) = 0$. Hence, $y \in K_0$ in the sense that $M \in K_0$, where $M$ is the terminus of the vector $y$.
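The relation between the zero cone and the zero subspace can be checked on a simple form. The sketch below assumes, purely as an example, the form $f(x) = x_1^2 - x_2^2$ in three-dimensional space; the vectors and function names are illustrative.

```python
def a(x, y):
    """Polar bilinear form of the assumed example f(x) = x1^2 - x2^2."""
    return x[0] * y[0] - x[1] * y[1]


def f(x):
    return a(x, x)


u, v = (1, 1, 0), (1, -1, 0)                  # two points of the zero cone
s = tuple(ui + vi for ui, vi in zip(u, v))    # their sum (2, 0, 0)

# The cone contains whole lines through the origin ...
print([f(tuple(lam * c for c in u)) for lam in (-2, 3, 5)])
# ... but it is not closed under addition, so it is not a linear subspace.
print(f(u), f(v), f(s))

# The third basis vector lies in the zero subspace: a(x, e3) = 0 for the
# basis vectors x (hence for all x), and therefore f(e3) = 0 as well.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
e3 = basis[2]
print([a(e, e3) for e in basis], f(e3))
```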

§  13.  Elementary  examples  of  zero  cones  of  quadratic  forms 

1. Let us consider in more detail some particular cases encountered in elementary analytic geometry. We will assume that the quadratic form is not equal to zero identically and has been reduced to normal form.

2. The real plane ($n = 2$).

(1) $f(x) = x_1^2 - x_2^2$. Here, $r = 2$ and $L_0$ is of zero dimension. Hence, $L_0$ consists of the zero point alone. The zero cone $K_0$ is defined by the equation $x_1^2 - x_2^2 = 0$ and decomposes into two straight lines: $x_1 = x_2$, $x_1 = -x_2$. Due to the small dimension, the cone is not a surface but is a line consisting of two intersecting straight lines (Fig. 26).

(2) $f(x) = x_1^2 + x_2^2$. Again, $r = 2$ and $L_0$ is of dimension 0. The zero cone is defined by the equation $x_1^2 + x_2^2 = 0$ and consists of a single point. We sometimes say that such an equation defines an imaginary cone.

(3) $f(x) = x_1^2$. Here, $r = 1$ and $L_0$ is of dimension 1. The cone $K_0$ is defined by the equation $x_1^2 = 0$. Hence, $K_0$ consists of points for which $x_1 = 0$. It is readily seen in this case that $L_0$ must coincide with $K_0$. Indeed, $L_0$ is of dimension 1, and by the foregoing $L_0$ must be included in the zero cone so that $L_0$ will be the sole straight line $x_1 = 0$ that is contained in $K_0$. The cone $K_0$ is defined by a second-degree equation. In the case at hand we say that every point of the axis $x_2$ of $K_0$ has to be counted twice.

In Case (3), only one square, $x_1^2$, participates. This is because the basis vector $e_2$ is placed in the zero subspace.

3.  Three-dimensional  real  space  (n  —  3). 

(1)  f(x)  =  x]-f-x!,  —  xl-  Here,  r  —  3  and  L0  is  of  dimension 
zero.  The  zero  cone  is  defined  by  the  equation  xj  +  x'i  —  *5  =  0. 


HO 


LINEAR.  BILINEAR  AND  QUADRATIC  FORMS 


[CH.  IV 


If  the  space  is  examined  from  the  elementary  point  of  view,  with 
angles,  distances,  and  so  forth,  then  such  an  equation  defines  a 
circular  cone  with  axis  on  the  axis  X3  and  with  a  right  angle 
between  the  generatrices. 

In  this  case,  Euclidean  space  serves  as  a  model  of  a  linear 
space.  However,  one  must  bear  in  mind  that  in  linear  (and  in 
affine)  space,  angles  are  not  defined,  nor  is  there  a  rule  for 
measuring  distances,  and  for  this  reason  the  concept  of  a  “circular 
cone”  is  meaningless.  Still,  this  does  not  prevent  us  from  using 
Euclidean  space  as  a  model  for  a  linear  (or  affine)  space.  The 


supplementary  properties  of  Euclidean  space  only  help  to  make 
the  descriptions  pictorial. 

(2)  f  (*)  =  x\  +  x\  +  x°-.  Here,  r  =  3,  L0  has  dimension  zero, 
and  Ko  is  defined  by  the  equation  x\-\-  x\  +  x\  =  Q.  This  is  an 
imaginary  cone.  In  real  space  it  has  only  the  zero  point. 

(3)  /  (x)  =  x]  -j-  xrr  Here,  r  =  2  and  L0  is  of  dimension  1.  Thus, 
L0  is  a  one-dimensional  linear  subspace,  that  is,  a  straight  line 
passing  through  the  origin  of  coordinates.  The  cone  Ko  is  defined 
by  the  equation  x']  +  ^  =  0  and  consists  of  points  of  the  form 
(0,  0,  x3)\  in  other  words,  it  is  the  set  of  points  of  the  third  axis. 
Since  L0  cz  Ko,  it  is  clear  that  L0  is  the  same  straight  line  (the 
third  coordinate  axis).  The  only  thing  to  bear  in  mind  is  that  in  Ko > 
every  point  of  this  straight  line  is  counted  twice,  not  once. 

Note  that  the  third  basis  vector  e3  is  placed  in  Lo  and  so  in  the 
representation  of  the  form  everything  that  is  connected  with  the 
third  coordinate  lias  vanished. 

M)  f  (x)  =  x'-j  —  xjj.  Here,  r  =  2  and  L0  is  of  dimension  1.  The 
cone  Ko  is  defined  by  the  equation  x\  —  x\  —  Q.  The  left-hand 


r.XAMPLES  OF  ZERO  CONES 


HI 


§  13] 

member  of  this  equation  can  be  factored  into  two  linear  factors  so 
that  the  cone  Ko  consists  of  two  planes.  We  will  consider  Euclidean 
space  as  the  model  of  the  linear  space.  Then  Ko  is  depicted  in  the 
form  of  a  pair  of  planes  that  pass  through  the  axis  x3,  intersect  at 
right  angles,  and  intersect  the  plane  x3  —  0  along  the  bisectors 
of  the  quadrantal  angles  (Fig.  27). 

In this example, the subspace $L_0$ may be found in two ways: by
computation and by geometrical reasoning. Let us consider them.

We write down the polar bilinear form and equate it to zero:

$x_1 y_1 - x_2 y_2 = 0$

It is possible to find $y = (y_1, y_2, y_3)$ for which this equation holds
true for any $x = (x_1, x_2, x_3)$. It is clear that $y_1 = y_2 = 0$ and $y_3$
can assume arbitrary values. Thus, $L_0$ coincides with the third
coordinate axis.

It  is  not  possible,  directly,  to  obtain  this  result  geometrically, 
as  was  done  in  the  preceding  example.  We  know  that  Lo  is  a 
straight  line  that  passes  through  the  origin,  but  there  are  many 
such  lines  in  Ko,  and  it  is  not  possible  at  once  to  isolate  one  of 
them  as  the  space  L0. 

We can however approach this differently. Note that the
quadratic-form notation does not involve the third coordinate. This
means that the third basis vector is placed in $L_0$. It then follows,
by the one-dimensionality of the zero subspace, that $L_0$ coincides
with the third coordinate axis.
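The computation in example (4) can be checked numerically. The following is a minimal sketch and not part of the original text: the names `polar`, `f` and `in_null_subspace` are illustrative assumptions, and the space is modelled by triples of numbers. A point belongs to $L_0$ when the polar form vanishes against every basis vector, and to $K_0$ when the form itself vanishes.

```python
def polar(x, y):
    """Polar bilinear form of f(x) = x1^2 - x2^2 in three-dimensional space."""
    return x[0] * y[0] - x[1] * y[1]


def f(x):
    """The quadratic form itself: f(x) = polar(x, x)."""
    return polar(x, x)


# Standard basis of the coordinate model (an assumption of this sketch).
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def in_null_subspace(y):
    # y lies in L0 when the polar form vanishes against every basis vector,
    # hence, by linearity, against every x.
    return all(polar(e, y) == 0 for e in basis)


on_axis = (0, 0, 5)   # a point of the third coordinate axis
on_cone = (1, 1, 0)   # a point of the plane x1 = x2, which lies in K0

print(f(on_axis), in_null_subspace(on_axis))
print(f(on_cone), in_null_subspace(on_cone))
```

Both points lie on the cone, but only the point of the third axis lies in the null subspace, in agreement with the reasoning above.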

(5) $f(x) = x_1^2$. Here, r = 1, $L_0$ has dimension 2, the cone $K_0$ has
the equation $x_1^2 = 0$ and is the plane $x_1 = 0$ (taken twice).
Geometrically, $L_0$ is that same plane $x_1 = 0$.

4. Remark. We have just considered all versions that may be
encountered when studying $L_0$ and $K_0$ in two-dimensional and
three-dimensional real spaces. Indeed, an arbitrary quadratic form may
be reduced to canonical form and then, if need be, multiplied by
-1. In this way, the matter at hand will reduce to one of the
cases considered above.


Chapter  V 


TENSOR  ALGEBRA 


§  1.  Reciprocal  bases.  Contravariant  and  covariant  vectors 

1. Let L be an n-dimensional linear space and L* the space
conjugate to it (that is, the space of all linear forms specified on
L; see Section 1 of Chapter IV). In L we introduce an arbitrary
basis $e_1, \ldots, e_n$. The coordinates (components) of an arbitrary
vector x in L in the basis $e_1, \ldots, e_n$ will be denoted by
$\{x^1, \ldots, x^n\}$. In the conjugate space we choose a basis
$e^1(x), \ldots, e^n(x)$ so that the values of the linear forms $e^i(x)$
on the vectors $e_j$ form a unit matrix:

$e^i(e_j) = \delta^i_j$  (1)

where $\delta^i_j$ is the Kronecker delta ($\delta^i_j = 1$ for $i = j$ and
$\delta^i_j = 0$ for $i \ne j$).

Definition. The basis $e^1(x), \ldots, e^n(x)$ in L* that satisfies the
conditions (1) is said to be the reciprocal of the given basis
$e_1, \ldots, e_n$ in L.

From the definition it follows that for any basis there exists a
unique reciprocal basis and that it is given by the formulas

$e^1(x) = 1 \cdot x^1 + 0 \cdot x^2 + \ldots + 0 \cdot x^n,$
$e^2(x) = 0 \cdot x^1 + 1 \cdot x^2 + \ldots + 0 \cdot x^n,$
$\ldots$
$e^n(x) = 0 \cdot x^1 + 0 \cdot x^2 + \ldots + 1 \cdot x^n$
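The formulas above say that the k-th form of the reciprocal basis simply reads off the k-th coordinate of x. The sketch below illustrates this in a coordinate model, which is an assumption made here and not in the text: the basis vectors are given by their standard coordinates in the plane, and each linear form is stored as its row of coefficients. Under that assumption the reciprocal forms are the rows of the inverse of the matrix whose columns are the basis vectors. The helper `inverse` and the sample basis are hypothetical.

```python
from fractions import Fraction


def inverse(m):
    """Invert a nonsingular square matrix (list of rows) by Gauss-Jordan elimination."""
    n = len(m)
    # Augment the matrix with the identity matrix.
    a = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        # Pick a row with a nonzero pivot (raises StopIteration if singular).
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


# Hypothetical basis e_1, e_2 of the plane, in standard coordinates.
e = [(1, 1), (0, 2)]
# Matrix B whose columns are the basis vectors.
B = [[e[j][i] for j in range(2)] for i in range(2)]
# Row k of the inverse holds the coefficients of the reciprocal form e^k.
forms = inverse(B)


def apply(form, v):
    """Value of a linear form (row of coefficients) on a vector."""
    return sum(c * x for c, x in zip(form, v))


# Table of values e^i(e_j); by condition (1) it must be the unit matrix.
table = [[apply(forms[i], e[j]) for j in range(2)] for i in range(2)]
print(table)
```

Exact fractions are used so that the comparison with the unit matrix is not disturbed by rounding.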

2. In the space L let us pass over to a new basis $e_{1'}, \ldots, e_{n'}$.
For the sake of convenience in notation, we will write formulas (I)
of Section 5, Chapter II, somewhat differently:

$e_{i'} = \sum_i P^i_{i'} e_i$  (I)

Here and henceforth we will prime all indices referring to a new
basis; no other special meaning is given to the symbols 1', 2', ...,
so that 1' = 1, 2' = 2, ..., n' = n. In the matrix P, the upper
index varies along each row and the lower index varies down each
column:

$P = \begin{pmatrix} P^1_{1'} & P^2_{1'} & \ldots & P^n_{1'} \\ P^1_{2'} & P^2_{2'} & \ldots & P^n_{2'} \\ \ldots & \ldots & \ldots & \ldots \\ P^1_{n'} & P^2_{n'} & \ldots & P^n_{n'} \end{pmatrix}$

Let an arbitrary vector x in L be resolved in terms of the old
basis and the new basis:

$x = x^1 e_1 + \ldots + x^n e_n = x^{1'} e_{1'} + \ldots + x^{n'} e_{n'}$

We will write formulas (III) of Section 5, Chapter II, expressing
the new components in terms of the old ones thus:

$x^{i'} = \sum_i Q^{i'}_i x^i$  (II)

We use Q to denote the matrix of coefficients of the right-hand
members of (II). Note that here we have to regard the upper index as
varying down each column and the lower index as varying along each
row:

$Q = \begin{pmatrix} Q^{1'}_1 & Q^{1'}_2 & \ldots & Q^{1'}_n \\ Q^{2'}_1 & Q^{2'}_2 & \ldots & Q^{2'}_n \\ \ldots & \ldots & \ldots & \ldots \\ Q^{n'}_1 & Q^{n'}_2 & \ldots & Q^{n'}_n \end{pmatrix}$

For the given arrangement of indices, both in the matrix P and
the matrix Q, the primed index denotes the number of the row, the
unprimed index the number of the column. The equations

$Q = (P^*)^{-1}, \quad P = (Q^*)^{-1}$  (2)

hold true, and formulas (4) of Section 5, Chapter II, become

$\sum_{i'} P^i_{i'} Q^{i'}_j = \delta^i_j, \quad \sum_i P^i_{i'} Q^{j'}_i = \delta^{j'}_{i'}$  (3)

We will henceforth make frequent use of relations (2) and (3)
without stipulating this in any way.
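Relations (2) and (3) are easy to check on a small example. The following sketch is an illustration only: the matrix `P`, the helper functions and the sample vector are hypothetical choices, with `P[i'][i]` standing for $P^i_{i'}$ and `Q[i'][i]` for $Q^{i'}_i$, so that in both matrices the primed index numbers the rows.

```python
from fractions import Fraction


def inverse(m):
    """Invert a nonsingular square matrix (list of rows) by Gauss-Jordan elimination."""
    n = len(m)
    a = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Hypothetical change of basis: e_1' = e_1 + e_2, e_2' = 2 e_2.
P = [[1, 1],
     [0, 2]]
n = len(P)

# Relation (2): Q is the inverse of the transpose of P.
Q = inverse(transpose(P))

# Relations (3), written out index by index.
first = [[sum(P[k][i] * Q[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
second = [[sum(P[i][k] * Q[j][k] for k in range(n)) for j in range(n)]
          for i in range(n)]

# Formula (II): new components of the vector x = 3 e_1 + 4 e_2.
x_old = [3, 4]
x_new = [sum(Q[ip][i] * x_old[i] for i in range(n)) for ip in range(n)]
print(first, second, x_new)
```

With this choice the new components are 3 and 1/2, and indeed 3(e_1 + e_2) + (1/2)(2 e_2) = 3 e_1 + 4 e_2.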

3. In the conjugate space L* we take a basis $e^{1'}(x), \ldots, e^{n'}(x)$
that is reciprocal to the new basis in L, that is, one that satisfies
the conditions

$e^{k'}(e_{i'}) = \delta^{k'}_{i'}$  (4)

Let us find the formulas for passing from the basis $e^k(x)$ to the
basis $e^{k'}(x)$. Relations of the form

$e^{k'}(x) = \sum_k A^{k'}_k e^k(x)$  (5)

with certain coefficients $A^{k'}_k$ definitely hold true. The matter at
hand therefore reduces to computing the coefficients $A^{k'}_k$ from the
given $P^i_{i'}$. From (4), (5), (I) and (1) we have

$\delta^{k'}_{i'} = e^{k'}(e_{i'}) = \sum_\alpha A^{k'}_\alpha e^\alpha(e_{i'}) = \sum_\alpha A^{k'}_\alpha e^\alpha\Bigl(\sum_\beta P^\beta_{i'} e_\beta\Bigr) = \sum_{\alpha,\beta} A^{k'}_\alpha P^\beta_{i'} \delta^\alpha_\beta$

Hence

$\sum_\alpha A^{k'}_\alpha P^\alpha_{i'} = \delta^{k'}_{i'}$  (6)

We denote the matrix of the desired coefficients of the right sides
of (5) by A and assume that the primed index denotes the number
of the row and the unprimed index denotes the number of the
column, that is, the variation is by row. Then all equations (6) are
equivalent to a single matrix equation:

$AP^* = E$  (7)

where, as usual, the asterisk denotes the transpose. From (7) we
get

$A = (P^*)^{-1} = Q$

Hence

$e^{k'}(x) = \sum_k Q^{k'}_k e^k(x)$  (I*)

4. Let an arbitrary linear form u(x), that is, an element of the
space L*, be resolved in terms of the old and the new basis:

$u(x) = u_1 e^1(x) + \ldots + u_n e^n(x) = u_{1'} e^{1'}(x) + \ldots + u_{n'} e^{n'}(x)$

Let us now find the formulas that express the new components of
the form u(x), that is, the coefficients of the resolution of u(x) in
terms of the basis $e^{i'}(x)$, in terms of the old components. To do
this, recall that the coefficients of the desired formulas constitute
a matrix which is the transposed inverse of the matrix of
formulas (I*). But inversion and transposition of the matrix Q
yields P. Thus

$u_{i'} = \sum_i P^i_{i'} u_i$  (II*)

5. We see that the formulas (I*) and (II*) are obtained from the
familiar formulas (I) and (II) if we interchange the roles of
matrices P and Q.

6. For greater pictorialness we give the following scheme.

Reciprocity of the bases:

$e^i(e_j) = \delta^i_j, \quad e^{i'}(e_{j'}) = \delta^{i'}_{j'}$

In L space:

$e_{i'} = \sum_i P^i_{i'} e_i$
$x = x^1 e_1 + \ldots + x^n e_n \in L$
$x^{i'} = \sum_i Q^{i'}_i x^i$
$x = x^{1'} e_{1'} + \ldots + x^{n'} e_{n'} \in L$
$Q = (P^*)^{-1}$

In L* space:

$u(x) = u_1 e^1(x) + \ldots + u_n e^n(x) \in L^*$
$u_{i'} = \sum_i P^i_{i'} u_i$
$u(x) = u_{1'} e^{1'}(x) + \ldots + u_{n'} e^{n'}(x) \in L^*$
$P = (Q^*)^{-1}$

7. We use the term contraction of an element
$a = a^1 e_1 + \ldots + a^n e_n$ in L with an element
$u(x) = u_1 e^1(x) + \ldots + u_n e^n(x)$ in L*
to signify a number denoted by (a, u) or (u, a) and defined by the
equation

$(a, u) = u_1 a^1 + \ldots + u_n a^n = \sum_k u_k a^k$

A contraction is obviously an invariant since it is nothing but the
value of the invariant form $u(x) = u_1 x^1 + \ldots + u_n x^n$ on the vector
$x = a = a^1 e_1 + \ldots + a^n e_n$.

The invariance of a contraction can also be derived as a
consequence of formulas (II) and (II*). Indeed,

$\sum_{k'} u_{k'} a^{k'} = \sum_{k'} \Bigl(\sum_\alpha P^\alpha_{k'} u_\alpha\Bigr)\Bigl(\sum_\beta Q^{k'}_\beta a^\beta\Bigr)$
$= \sum_{\alpha,\beta} \Bigl(\sum_{k'} P^\alpha_{k'} Q^{k'}_\beta\Bigr) u_\alpha a^\beta = \sum_{\alpha,\beta} \delta^\alpha_\beta u_\alpha a^\beta = \sum_\alpha u_\alpha a^\alpha$
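The invariance just derived can be observed numerically. This is a minimal sketch under assumed data: the matrices `P` and `Q` (with `Q` equal to the inverse of the transpose of `P`) and the components of `a` and `u` are hypothetical. The vector components are transformed by (II), the components of the form by (II*), and the contraction is computed in both bases.

```python
from fractions import Fraction

# Hypothetical change of basis; P[k'][k] stands for P^k_{k'}.
P = [[1, 1],
     [0, 2]]
# Q = (P*)^{-1}, written out by hand for this P; Q[k'][k] stands for Q^{k'}_k.
Q = [[Fraction(1), Fraction(0)],
     [Fraction(-1, 2), Fraction(1, 2)]]
n = 2

a = [3, 4]   # contravariant components a^k (old basis)
u = [5, 7]   # covariant components u_k (old basis)

# Formula (II): the components of a transform with Q.
a_new = [sum(Q[kp][k] * a[k] for k in range(n)) for kp in range(n)]
# Formula (II*): the components of u transform with P.
u_new = [sum(P[kp][k] * u[k] for k in range(n)) for kp in range(n)]

old_value = sum(uk * ak for uk, ak in zip(u, a))
new_value = sum(uk * ak for uk, ak in zip(u_new, a_new))
print(old_value, new_value)
```

The two printed values coincide, as the derivation requires.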

8. It is clear that a contraction possesses the following two
properties:

(A) When u or x is multiplied by a scalar, the contraction
(u, x) is multiplied by that scalar:

$(\alpha u, x) = (u, \alpha x) = \alpha (u, x)$

(B) A contraction is distributive with respect to addition:

$(u + u', x) = (u, x) + (u', x),$

$(u, x + x') = (u, x) + (u, x')$

9. Note the complete symmetry in the interrelationships of L
and L*. We consider the contraction

$(u, x) = u_1 x^1 + \ldots + u_n x^n$

If here the element u in L* is fixed and $x = \{x^1, \ldots, x^n\}$ in L
varies in arbitrary fashion, then the contraction (u, x) is a linear
form, taken for u, with numerical arguments $x^1, \ldots, x^n$. And L
may be regarded as a coordinate space. However, we can just as
easily regard u as an element of a coordinate space, namely, the
very element that is defined by the coefficients of the form (u, x),
and we can write $u = \{u_1, \ldots, u_n\}$.

Now suppose that the element $x = \{x^1, \ldots, x^n\}$ in L is fixed
and that $u = \{u_1, \ldots, u_n\}$ varies. In that case, the contraction
(u, x) is a linear form with numerical arguments $u_1, \ldots, u_n$. For
the x we can take the form itself instead of the n-tuple
of coefficients of the form. Thus, the elements $x \in L$ are interpreted
in exactly the same way relative to the elements $u \in L^*$ as the
elements $u \in L^*$ are relative to the elements $x \in L$. In other words,
if the space L* is conjugate to L, then L may be regarded as
conjugate to L*.

10. A symmetry in the interrelationships between L and L* was
already perceived above when passing from the formulas (I) and
(II) to the formulas (I*) and (II*) (also see the table in
Subsection 6). One should bear in mind that one of the spaces L and L*
(namely L) is taken for the original space. This circumstance will
be seen to affect the terminology discussed in the next subsection.

11.  A  transformation  by  formula  (I)  with  matrix  P  is  called  a 
transformation  by  the  covariant  law. 

A  transformation  by  formula  (II)  with  matrix  Q  is  called  a 
transformation  by  the  contravariant  law. 

In  the  given  space  L  the  components  of  every  vector  transform 
by  the  contravariant  law.  In  the  conjugate  space  the  components 
of  the  vectors  transform  by  the  covariant  law. 

Accordingly, the vectors of the given space L are called
contravariant vectors and the elements of the conjugate space are called
covariant vectors.

12. In tensor calculus the practice is to use lower indices in the
case of the covariant law of transformation and upper indices for
the contravariant law of transformation. Accordingly, we used
upper indices for the components (coordinates) of vectors in L.

Setting the indices on elements of the matrices P and Q is done
so that the summation index which appears twice in the expression
of the general term of the sum is a subscript in one case and a
superscript in the other (see table in Subsection 6). If under the
summation sign there are free indices on which summation is not
performed, then the same indices (upper or lower) are set on the
quantities obtained by the summation. These rules help to
determine the transformation laws of quantities obtained by the
summation. At the same time, these rules require, for instance, that the
number-labels of the covariant basis vectors be indicated by upper
indices.

13. Throughout this chapter we will assume that the bases
chosen in L and L* are reciprocal bases. The basis in L* reciprocal to
the basis $\{e_i\}$ in L will be denoted by $\{e^i\}$, thus simplifying the
earlier designation $e^i(x)$. An arbitrary element in L* will be denoted
by u (or v, and so forth) in place of u(x) (or v(x), and so forth).

14. A new definition of a conjugate space. The concept of
conjugate spaces may be explained in a somewhat different manner so
that their reciprocal equivalence will be evident from the very
definition.

Let L and L* be two linear spaces. For the sake of simplicity we
will assume that they are finite-dimensional and have the same
dimension n. Suppose that to every pair of elements $x \in L$, $u \in L^*$
is associated a number; we denote it by (x, u) and call it the
contraction of the elements x, u if the following properties hold.

(1) The distributive property with respect to each element:

$(x, u_1 + u_2) = (x, u_1) + (x, u_2),$

$(x_1 + x_2, u) = (x_1, u) + (x_2, u)$

for arbitrary $x, x_1, x_2 \in L$, $u, u_1, u_2 \in L^*$.

(2) The associative property with respect to multiplication of
any element by a scalar:

$(\alpha x, u) = (x, \alpha u) = \alpha (x, u)$

(3) The property of nonsingularity: if $a_1, \ldots, a_n$ are linearly
independent in L and $(a_1, u) = 0, \ldots, (a_n, u) = 0$, then u is the
zero element of L*. Similarly, if $b_1, \ldots, b_n$ are linearly
independent in L* and $(x, b_1) = 0, \ldots, (x, b_n) = 0$, then x is the zero
element of L.

The spaces L and L* may both be real or both complex.
Accordingly, the scalars are all real or complex.

Let us denote by $e_1, \ldots, e_n$ an arbitrary basis in L and by
$e^1, \ldots, e^n$ an arbitrary basis in L*. Let $x = \sum_i x^i e_i \in L$,
$u = \sum_k u_k e^k \in L^*$. By properties (1) and (2) we have

$(x, u) = \sum_{i,k} a^k_i x^i u_k$  (8)

where $a^k_i = (e_i, e^k)$. Thus a contraction is expressed as the bilinear
form (8). It is easy to see that Property (3), that is, the condition
of nonsingularity, signifies the nonsingularity of the bilinear form
(8). It is also readily seen that the contractions (x, u) may be
specified by formula (8) in different ways by arbitrarily assigning
the numbers $a^k_i$, so long as $\det a^k_i \ne 0$. The conditions (1), (2)
and (3) will hold true.

The spaces L and L* are called (reciprocally) conjugate spaces
if a contraction is specified for them and if they are considered
together with the given contraction. Given this definition, we can
now construct for a given L an infinitude of distinct conjugate
spaces L* (to put it more precisely, spaces differently conjugated
to L). In order to eliminate this indeterminacy, we will define the
concept of equivalence of linear spaces differently conjugated to
the given space L.

Let us denote by $L^*_1$ and $L^*_2$ two n-dimensional spaces
conjugate to L. We will say that they are equivalent in the sense of
conjugacy to L if they are related by a linear isomorphism such
that

$(x, u) = (x, u')$  (9)

where x is an arbitrary element in L and u is an arbitrary element
in $L^*_1$, and u' is the element of $L^*_2$ that corresponds to u via the
isomorphism.

It will readily be seen that all linear spaces conjugate to a
given L are equivalent. To prove this assertion it suffices to establish
that if an arbitrary contraction is given for L and L*, then for any
basis $e_1, \ldots, e_n \in L$ there will be a unique reciprocal basis
$e^1, \ldots, e^n$ in the space L*. In other words, $e^1, \ldots, e^n \in L^*$ may be
found (in unique fashion) so that $(e_i, e^k) = \delta^k_i$.

To prove this, let us consider an arbitrary basis of L*, written here
as $\tilde{e}^1, \ldots, \tilde{e}^n$ to keep it apart from the reciprocal basis being
sought, with the aid of which a contraction is specified by the
formula (8), so that $a^k_i = (e_i, \tilde{e}^k)$. We will seek the first vector
$e^1$ of the basis $e^1, \ldots, e^n$ in the form

$e^1 = \alpha_1 \tilde{e}^1 + \alpha_2 \tilde{e}^2 + \ldots + \alpha_n \tilde{e}^n$

We must have $(e_1, e^1) = 1$, $(e_2, e^1) = 0$, ..., $(e_n, e^1) = 0$, whence

$a^1_1 \alpha_1 + a^2_1 \alpha_2 + \ldots + a^n_1 \alpha_n = 1,$
$a^1_2 \alpha_1 + a^2_2 \alpha_2 + \ldots + a^n_2 \alpha_n = 0,$  (10)
$\ldots$
$a^1_n \alpha_1 + a^2_n \alpha_2 + \ldots + a^n_n \alpha_n = 0$

System (10) is unambiguously solvable since $\det a^k_i \ne 0$.
Similarly, assuming $e^2 = \beta_1 \tilde{e}^1 + \ldots + \beta_n \tilde{e}^n$, we find $e^2$ from the
conditions $(e_1, e^2) = 0$, $(e_2, e^2) = 1$, $(e_3, e^2) = 0$, ..., $(e_n, e^2) = 0$.

Continuing the process, we find all the vectors $e^1, \ldots, e^n \in L^*$
such that $(e_i, e^k) = \delta^k_i$. They constitute an independent system.
Indeed, let

$\sum_i \lambda_i e^i = \theta, \quad \theta \in L^*$

Contracting the left and right sides of this equation with the
vector $e_k \in L$, we find

$\Bigl(e_k, \sum_i \lambda_i e^i\Bigr) = (e_k, \theta)$

or $\lambda_k = 0$ (k = 1, 2, ..., n) since $(e_k, e^i) = \delta^i_k$ and $(e_k, \theta) = 0$.
Which proves that for any basis $e_1, \ldots, e_n \in L$ there exists a
reciprocal basis $e^1, \ldots, e^n$ in the space L* for any kind of specification
of conjugacy between L and L*.
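The existence argument amounts to solving system (10) once for each vector of the reciprocal basis. The sketch below is an illustration under assumptions not found in the text: the contraction is given by a hypothetical nonsingular matrix `a` with `a[i][k]` standing for the contraction of $e_i$ with the k-th vector of the arbitrary basis of L*, and all n systems are solved at once by inverting `a`. The helper `inverse` is likewise hypothetical.

```python
from fractions import Fraction


def inverse(m):
    """Invert a nonsingular square matrix (list of rows) by Gauss-Jordan elimination."""
    n = len(m)
    a = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


# Hypothetical contraction matrix: a[i][k] = (e_i, k-th arbitrary basis vector of L*).
a = [[1, 2],
     [0, 1]]
n = len(a)

# Row j of c holds the coefficients of the reciprocal vector e^j in the
# arbitrary basis of L*. The conditions sum_k a[i][k] * c[j][k] = delta
# say that c is the transpose of the inverse of a.
a_inv = inverse(a)
c = [[a_inv[k][j] for k in range(n)] for j in range(n)]

# Check: the table of contractions (e_i, e^j) must be the unit matrix.
check = [[sum(a[i][k] * c[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]
print(c, check)
```

The first row of `c` is the solution of system (10) for this matrix; the remaining rows solve the analogous systems for the other reciprocal vectors.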

We now prove uniqueness. Suppose that for the given basis
$e_1, \ldots, e_n \in L$ there are two reciprocal bases in the space L*:
$e^i_{(1)}$ and $e^i_{(2)}$. We have $(e_k, e^i_{(1)}) = \delta^i_k$ and
$(e_k, e^i_{(2)}) = \delta^i_k$, whence $(e_k, e^i_{(1)} - e^i_{(2)}) = 0$ for all
k = 1, ..., n and for any i = 1, 2, ..., n.
From this and by the condition of nonsingularity we find
$e^i_{(1)} - e^i_{(2)} = \theta$, or $e^i_{(1)} = e^i_{(2)}$.

Now suppose $x = x^1 e_1 + \ldots + x^n e_n \in L$,
$u = u_1 e^1 + \ldots + u_n e^n \in L^*$, where $e_i$ and $e^k$ are reciprocal
bases. Formula (8) then assumes the form

$(x, u) = x^1 u_1 + \ldots + x^n u_n$  (11)

At the same time we have proved the equivalence of all spaces
conjugate to L. Indeed, suppose $L^*_1$ and $L^*_2$ are two spaces conjugate
to L, $e_1, \ldots, e_n$ is any basis in L, $e^1, \ldots, e^n$ is the reciprocal basis
in $L^*_1$, and $(e^1)', \ldots, (e^n)'$ is the basis in $L^*_2$ reciprocal to
$e_1, \ldots, e_n$. We establish a linear isomorphism between $L^*_1$ and $L^*_2$
assuming

$u' = u_1 (e^1)' + \ldots + u_n (e^n)', \quad u' \in L^*_2$

as the appropriate element for an arbitrary

$u = u_1 e^1 + \ldots + u_n e^n, \quad u \in L^*_1$

Then, by (11),

$(x, u) = (x, u')$

It is now clear that the new definition of conjugate spaces does not
differ from the earlier given definition. It suffices to notice that
associated with an arbitrary element $u \in L^*$ is the linear form

$(x, u) = u_1 x^1 + \ldots + u_n x^n$

where $u_1, \ldots, u_n$ are coefficients (the constant components of the
given vector $u \in L^*$).

§  2.  Tensor  product  of  linear  spaces 

1. Suppose we have two linear spaces L and $\tilde{L}$, both real or both
complex (possibly infinite-dimensional). Using vectors taken from
L and $\tilde{L}$, we will construct some new entities whose set will be
denoted by T.

First of all, for elements of T we will take all possible pairs of
vectors ab, where $a \in L$, $b \in \tilde{L}$. Besides, there will also be
elements of T consisting of all possible finite sets of such pairs. There
will be no other elements in T. In other words, any element $t \in T$
is of the form

$t = \{a_1 b_1, \ldots, a_k b_k\}$  (1)

where $a_1, \ldots, a_k \in L$, $b_1, \ldots, b_k \in \tilde{L}$. This equation has only one
meaning: that an element, denoted by t, of the set T is the k-tuple
(complex) of pairs $a_1 b_1, \ldots, a_k b_k$.

We agree always to write the element of L in the first position
of a pair. If L and $\tilde{L}$ coincide, then the pairs of vectors that make
up the elements of T are considered to be ordered, that is, the
order of the vectors constituting a pair is essential. Thus, in the
case $L = \tilde{L}$, $a \in L$, $b \in L$, we have, generally, $ab \ne ba$.

2. For what follows it will be more convenient to call the pair
ab a product of a by b and in place of the words "sets of pairs"
or "k-tuples of pairs" to use the word "sum". Accordingly, in place
of (1) we will write

$t = a_1 b_1 + \ldots + a_k b_k$  (1')

where $a_1, \ldots, a_k \in L$, $b_1, \ldots, b_k \in \tilde{L}$. Very often, the pair ab is
termed the symbolic product of a by b, and the sum (1') is called
the symbolic sum. It will be apparent later on that this
arithmetical terminology is fairly well justified.

3. We now introduce three equivalence conditions for the set T,
that is, conditions under which certain elements of T are said to
be equal, namely:

(1) the symbolic sum does not depend on the order of the terms,

(2) $(a + b)c = ac + bc$,

where a, b are any vectors in L and c is any vector in $\tilde{L}$.
Similarly,

$a(b + c) = ab + ac$, where $a \in L$ and $b, c \in \tilde{L}$,

(3) $(\alpha a)b = a(\alpha b)$, where a, b are arbitrary vectors taken from L
and $\tilde{L}$, respectively, and $\alpha$ is any scalar (real if L and $\tilde{L}$ are real
spaces, and complex if these spaces are complex).

Remark. We have not yet mentioned another condition of
equivalence that has been taken for granted and should have been
stated first: that under an admissible replacement of the vectors
$a_1, \ldots, a_k, b_1, \ldots, b_k$ the element t is subjected to an admissible
replacement, which means it is carried into itself.



4. In other words, the conditions stated in Subsection 3 mean
that admissible replacements of the element

$t = a_1 b_1 + \ldots + a_k b_k \in T$  (2)

are:

(a) any change in the order of the pairs $a_1 b_1, \ldots, a_k b_k$ in the
sum (2);

(b) replacement of a pair by a sum of pairs or the replacement
of a sum of pairs by one pair in accordance with (2) of
Subsection 3; for example, if $a_1 = a_1' + a_1''$, then the pair $a_1 b_1$ in t may be
replaced by the sum $a_1' b_1 + a_1'' b_1$;

(c) the transfer of a numerical factor from one vector of a given
pair to the other vector of that pair.

At the same time, two elements $t_1, t_2 \in T$ are said to be equal
if and only if, by means of a finite number of admissible
replacements, they can be reduced to one and the same set of pairs of
elements taken from L and $\tilde{L}$.

5. We now introduce linear operations into the set T.

(1) The sum of two elements of the set T,

$t = a_1 b_1 + \ldots + a_k b_k,$
$t' = a_1' b_1' + \ldots + a_m' b_m'$

is an element of the set T which constitutes a complex of pairs of the
element t combined with a complex of pairs of the element t':

$t + t' = a_1 b_1 + \ldots + a_k b_k + a_1' b_1' + \ldots + a_m' b_m'$

(2) The product of an element t by a scalar $\alpha$ is defined by the
equation

$\alpha t = (\alpha a_1) b_1 + \ldots + (\alpha a_k) b_k$

By Subsection 4, the sum t + t' and the product $\alpha t$ are invariant
to admissible replacements of the elements t and t'.

We now prove that the set T together with such linear
operations constitutes a linear space.

First note that the axioms (1), (2) and (5)-(8) of a linear space
obviously hold true for T because of the definition of linear
operations just given and due to the conditions (1) and (3) of
Subsection 3. It remains to verify axioms (3) and (4).

To verify axiom (3), we have to find a zero element in T. We
will show that the zero in T is the pair $\theta\tilde{\theta}$, where $\theta$ is the zero
element of L and $\tilde{\theta}$ is the zero element of $\tilde{L}$. Let us first establish
that no matter what the element $b \in \tilde{L}$ we have $\theta\tilde{\theta} = \theta b$
(similarly, $\theta\tilde{\theta} = a\tilde{\theta}$ for any $a \in L$). Indeed, by condition (3) of
Subsection 3,

$\theta\tilde{\theta} = \theta (0 \cdot b) = (0 \cdot \theta) b = \theta b$

whence

$ab + \theta\tilde{\theta} = ab + \theta b = (a + \theta) b = ab$

Finally, if $t = a_1 b_1 + \ldots + a_k b_k$ is any element in T, then

$t + \theta\tilde{\theta} = a_1 b_1 + \ldots + (a_k b_k + \theta\tilde{\theta}) = a_1 b_1 + \ldots + a_k b_k = t$

This establishes that the third axiom of a linear space holds true
for T.

The fourth axiom is easy to verify. Namely, for any $t \in T$ the
additive inverse is $(-1) \cdot t$. Indeed,

$t + (-1) \cdot t = a_1 b_1 + \ldots + a_k b_k + (-1)(a_1 b_1 + \ldots + a_k b_k)$
$= (a_1 + (-1) \cdot a_1) b_1 + \ldots + (a_k + (-1) \cdot a_k) b_k$
$= \theta b_1 + \ldots + \theta b_k = \theta\tilde{\theta} + \ldots + \theta\tilde{\theta} = \theta\tilde{\theta}$


This  completes  the  proof  of  our  assertion  concerning  the  set  T. 

6. Definition. The linear space T, taking into account the
construction of its elements in the form of sums of products of
elements of L and $\tilde{L}$, is termed a tensor product of the space L by $\tilde{L}$.
In symbolic notation, we have

$T = L \otimes \tilde{L}$

The elements of the space T regarded as sums of products of
elements taken from L and $\tilde{L}$ are called tensors over the spaces L
and $\tilde{L}$.

7. Besides the spaces L and $\tilde{L}$, let us consider their conjugate
spaces L* and $\tilde{L}^*$ and let us introduce one more operation, called
the contraction of elements taken from T with elements from L*
and from $\tilde{L}^*$.

For an element of T let us first of all take one pair ab, $a \in L$,
$b \in \tilde{L}$. Let $u \in \tilde{L}^*$. We denote the contraction of the pair ab with
respect to the second (right) element with the element u (right
contraction) by (ab, u) and we define it by the equation

$(ab, u) = a(b, u)$  (3)

Here, (b, u) is the contraction of the element $b \in \tilde{L}$ with the
element $u \in \tilde{L}^*$, as understood in the sense of Subsection 7, Section 1
of this chapter. Since (b, u) is a number, the contraction (3) is a
vector collinear with a, that is, a vector in L.

The contraction of the pair ab with respect to its first (left)
element with the element $v \in L^*$ (left contraction) is denoted and
defined by the equation

$(v, ab) = (v, a) b$

This is a vector of $\tilde{L}$ collinear with the vector b.

We determine the contraction of the element
$t = a_1 b_1 + \ldots + a_k b_k \in T$ (a right contraction, for example) with
the element $u \in \tilde{L}^*$ termwise:

$(a_1 b_1 + \ldots + a_k b_k, u) = a_1 (b_1, u) + \ldots + a_k (b_k, u)$
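The termwise definition can be tried out on a small case. The sketch below rests on assumptions made only for illustration: vectors of L and of the second space are stored as tuples of components, the covariant vector u as a tuple of components in the reciprocal basis, and an element t as a list of pairs. The function names are hypothetical. Three representations of one and the same element, related by the admissible replacements (b) and (c), are contracted with the same u.

```python
from fractions import Fraction


def pairing(b, u):
    """Contraction (b, u) of a vector with a covector, in reciprocal bases."""
    return sum(bj * uj for bj, uj in zip(b, u))


def right_contraction(t, u):
    """Termwise right contraction of t = a_1 b_1 + ... + a_k b_k with u."""
    n = len(t[0][0])
    result = [0] * n
    for a, b in t:
        s = pairing(b, u)                       # the number (b, u)
        result = [r + s * ai for r, ai in zip(result, a)]
    return result


u = (1, 1, 1)

# A single pair ab with a = (3, 1), b = (1, 0, 2).
t1 = [((3, 1), (1, 0, 2))]
# Replacement (b): a = a' + a'' with a' = (1, 0), a'' = (2, 1).
t2 = [((1, 0), (1, 0, 2)), ((2, 1), (1, 0, 2))]
# Replacement (c): the factor 2 moved from b to a.
t3 = [((6, 2), (Fraction(1, 2), 0, 1))]

print(right_contraction(t1, u))
print(right_contraction(t2, u))
print(right_contraction(t3, u))
```

All three calls return the same vector of L, which is what the invariance of the contraction under admissible replacements requires.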


8. The contraction of the element $t \in T$ with the element $v \in L^*$
or with the element $u \in \tilde{L}^*$ is invariant to admissible replacements
of the element t.

Proof. By the definition of a contraction and due to
Subsection 7, Section 1, Chapter V, the contraction is distributive with
respect to addition of elements taken from T, L and $\tilde{L}$; numerical
factors can be taken outside the contraction symbol. For this
reason, an admissible replacement of the element t implies also an
admissible replacement of its contraction with u or v.

Corollary. If two elements $t_1, t_2 \in T$ are equal, then the
contractions of $t_1$ and $t_2$ with respect to the right elements of their pairs
with one and the same element $u \in \tilde{L}^*$ are also equal (this
assertion naturally holds true also for contractions relative to the left
elements).

§  3.  Basis  in  a  tensor  product.  Components  of  a  tensor 

1. Now let L and $\tilde{L}$ be finite-dimensional; denote their
dimensions by n and m respectively. Let $e_1, \ldots, e_n$ be a basis in L and
$\tilde{e}_1, \ldots, \tilde{e}_m$ a basis in $\tilde{L}$. Consider $T = L \otimes \tilde{L}$.

Lemma. If

$e_1 \tilde{a}_1 + e_2 \tilde{a}_2 + \ldots + e_n \tilde{a}_n = \theta\tilde{\theta}$  (1)

where $\tilde{a}_i \in \tilde{L}$, then

$\tilde{a}_1 = \tilde{a}_2 = \ldots = \tilde{a}_n = \tilde{\theta}$  (2)

Proof. In L* we consider the basis $e^1, \ldots, e^n$, which is
reciprocal to the given basis in L. Take the left contraction of equation
(1) with the vector $e^1$. On the basis of Subsection 7, Section 2, we
get the following equation:

$(e^1, e_1) \tilde{a}_1 + (e^1, e_2) \tilde{a}_2 + \ldots + (e^1, e_n) \tilde{a}_n = (e^1, \theta) \tilde{\theta} = 0 \cdot \tilde{\theta} = \tilde{\theta}$

whence

$1 \cdot \tilde{a}_1 + 0 \cdot \tilde{a}_2 + \ldots + 0 \cdot \tilde{a}_n = \tilde{\theta}$

Consequently, $\tilde{a}_1 = \tilde{\theta}$. Similarly we find the remaining equations
(2) by contracting (1) with $e^2, \ldots, e^n$. This completes the proof
of the lemma.

2. Theorem 1. All pairs $e_i \tilde{e}_j$ are linearly independent in the
space T.

Proof. Suppose we have the relation

$\sum_{i,j} \alpha_{ij} e_i \tilde{e}_j = \theta\tilde{\theta}$  (3)

where $\alpha_{ij}$ are certain scalars. Equation (3) may be written as

$\sum_i e_i \Bigl(\sum_j \alpha_{ij} \tilde{e}_j\Bigr) = \theta\tilde{\theta}$

From this and from the preceding lemma it follows that

$\sum_j \alpha_{ij} \tilde{e}_j = \tilde{\theta}$  (4)

for every number i. And since $\tilde{e}_j$ are vectors of the basis, it follows
from (4) that $\alpha_{ij} = 0$. Theorem 1 is proved.

Theorem 2. The pairs $e_i \tilde{e}_j$ form a basis in the space T.

Proof. Let $t \in T$ and we have $t = a_1 b_1 + \ldots + a_k b_k$.
Decompose the vectors $a_1, \ldots, a_k \in L$ in terms of the basis $e_1, \ldots, e_n$,
and the vectors $b_1, \ldots, b_k \in \tilde{L}$ in terms of the basis $\tilde{e}_1, \ldots, \tilde{e}_m$.
Then after grouping terms we get

$t = \sum_{i=1}^{n} \sum_{j=1}^{m} \tau^{ij} e_i \tilde{e}_j$  (5)

where $\tau^{ij}$ are certain numerical coefficients. By (5), any element
in T is linearly expressible in terms of the pairs $e_i \tilde{e}_j$; from this and
from Theorem 1 follows Theorem 2.

Corollary. If L and $\tilde{L}$ have dimensions n and m, then the tensor
product $T = L \otimes \tilde{L}$ is finite-dimensional and has dimension nm.
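The grouping of terms that leads to (5) can be made explicit. In the sketch below, which is only an illustration, each vector is stored as the tuple of its components in the chosen basis, an element t is a list of pairs, and the coefficient of the pair built from the i-th and j-th basis vectors is obtained by adding up the products of the i-th component of a and the j-th component of b over all pairs. The function name `components` and the sample data are hypothetical.

```python
def components(t, n, m):
    """Coefficients tau[i][j] of t = a_1 b_1 + ... + a_k b_k in the basis of pairs."""
    tau = [[0] * m for _ in range(n)]
    for a, b in t:
        for i in range(n):
            for j in range(m):
                # Contribution of the pair ab to the coefficient of e_i e~_j.
                tau[i][j] += a[i] * b[j]
    return tau


# One pair ab with a = (3, 1) in a 2-dimensional L and b = (1, 0, 2)
# in a 3-dimensional second space.
t1 = [((3, 1), (1, 0, 2))]
# The same element after splitting a into (1, 0) + (2, 1).
t2 = [((1, 0), (1, 0, 2)), ((2, 1), (1, 0, 2))]

tau1 = components(t1, 2, 3)
tau2 = components(t2, 2, 3)
print(tau1, tau2)
# Number of components: n * m.
print(sum(len(row) for row in tau1))
```

Equal elements of T yield the same array of n times m coefficients, in line with Theorem 2 and its corollary.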

3. We will now specially consider three cases of a tensor
product of two spaces when either the two spaces coincide or one of
them is conjugate to the other.

Let L be a space of dimension n and L* the space conjugate to
it. Let $e_1, \ldots, e_n$ be a basis in L and $e^1, \ldots, e^n$ the reciprocal basis
in L*.

(1) The tensor product of L by L will be denoted by $T^2_0$. By
Subsection 2, any element $t \in T^2_0 = L \otimes L$ has the
decomposition

$t = \sum \tau^{ij} e_i e_j$  (6)

The elements of the product $T^2_0 = L \otimes L$ are called contravariant
tensors of order two over the space L.

(2) We denote the product L* by L* by $T^0_2$. The elements of
this product are termed covariant tensors of order two over L. For
any $t \in T^0_2 = L^* \otimes L^*$ we have

$t = \sum \tau_{ij} e^i e^j$  (7)

(3) Denote the product L by L* by $T^1_1$. Its elements are mixed
tensors of order two over L. For every $t \in T^1_1 = L \otimes L^*$ we have

$t = \sum \tau^i_j e_i e^j$  (8)

Remark. The elements of the spaces L and L* themselves, that
is, the contravariant and covariant vectors, are also called tensors
of order one (contravariant and covariant, respectively).

4. The coefficients of the expansions (6), (7), (8) are called
components of their tensors in the basis $e_1, \ldots, e_n$ of space L.
They are indicated by upper or lower indices depending on the
structure of the tensor. How precisely is seen in (6), (7), (8).

Remark. Tensors are defined by components, and so when we
say "given a tensor", we write the components, for instance $\tau^{ij}$
(just as in analytic geometry we say given a point (x, y)).

5. The elements of an arbitrary $n \times n$ square matrix may be
taken for the components of a tensor with respect to a given basis.
Their specification in the form of an array (that is to say, in
accordance with the indices) will always define a certain tensor
in $T^2_0$ and also in $T^0_2$, and in $T^1_1$. When passing to a new basis,
the tensor components transform according to special laws that
correspond to $T^2_0$, $T^0_2$ and $T^1_1$. Let us find these laws.

6. For tensor components in $T_0^2 = L \otimes L$ we have the contravariant law of transformation on both indices.

This means that if in space $L$ we pass to a new basis

$$e_{i'} = \sum P^i_{i'} e_i \tag{9}$$

then the new components of the tensor $t \in T_0^2$ will be expressed in terms of the old components by the formulas

$$\tau^{i'j'} = \sum_{i,j} \tau^{ij} Q^{i'}_i Q^{j'}_j \tag{I}$$

Proof. The inverses of formulas (9) are

$$e_i = \sum Q^{i'}_i e_{i'}$$

whence

$$t = \sum \tau^{ij} e_i e_j = \sum \tau^{ij} \Big(\sum Q^{i'}_i e_{i'}\Big) \Big(\sum Q^{j'}_j e_{j'}\Big) = \sum \Big(\sum \tau^{ij} Q^{i'}_i Q^{j'}_j\Big) e_{i'} e_{j'} \tag{10}$$

On the other hand,

$$t = \sum \tau^{i'j'} e_{i'} e_{j'} \tag{11}$$

Comparing (10) and (11), we get (I), which completes the proof.

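7. The transformation law (I) was derived as a consequence of the invariance of tensors $t \in T_0^2$ relative to a choice of basis in the space. We made use of this invariance when we compared (10) and (11). Conversely, the invariance of the tensors $t \in T_0^2$ follows from (I). In detail: if in the basis $e_1, \ldots, e_n \in L$ the numbers $\tau^{ij}$ $(i, j = 1, 2, \ldots, n)$ are given arbitrarily, and if in passing to another basis $e_{1'}, \ldots, e_{n'} \in L$ by formulas (9) they are replaced by the numbers $\tau^{i'j'}$ in accordance with (I), then

$$\sum \tau^{i'j'} e_{i'} e_{j'} = \sum \tau^{ij} e_i e_j$$

Proof. Using (9) and (I), we get

$$\sum \tau^{i'j'} e_{i'} e_{j'} = \sum \Big(\sum \tau^{\alpha\beta} Q^{i'}_\alpha Q^{j'}_\beta\Big) \Big(\sum P^i_{i'} e_i\Big) \Big(\sum P^j_{j'} e_j\Big)$$

$$= \sum \tau^{\alpha\beta} \Big(\sum Q^{i'}_\alpha P^i_{i'}\Big) \Big(\sum Q^{j'}_\beta P^j_{j'}\Big) e_i e_j$$

$$= \sum \tau^{\alpha\beta} \delta^i_\alpha \delta^j_\beta e_i e_j = \sum \tau^{ij} e_i e_j$$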
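The law (I) and the invariance statement of Subsection 7 can be checked numerically. The following is only a minimal sketch under stated assumptions: $L$ is taken to be two-dimensional, the product $e_i e_j$ is modelled as an outer product of coordinate columns, and the matrix `P`, the components `tau` and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction

N = 2  # dimension of L in this toy example (an assumption)

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_contravariant(tau, Q):
    """Law (I): tau'[i'][j'] = sum over i, j of tau[i][j] * Q[i'][i] * Q[j'][j]."""
    return [[sum(tau[i][j] * Q[ip][i] * Q[jp][j]
                 for i in range(N) for j in range(N))
             for jp in range(N)] for ip in range(N)]

def assemble(tau, basis):
    """Sum of tau[i][j] * e_i e_j, modelling e_i e_j as an outer product."""
    return [[sum(tau[i][j] * basis[i][r] * basis[j][c]
                 for i in range(N) for j in range(N))
             for c in range(N)] for r in range(N)]

old_basis = [[1, 0], [0, 1]]   # e_1, e_2 in reference coordinates
P = [[2, 1], [1, 1]]           # P[i][ip] stands for P^i_{i'} (hypothetical values)
Q = inverse_2x2(P)             # Q[ip][i] stands for Q^{i'}_i
# Formula (9): e_{i'} = sum over i of P^i_{i'} e_i
new_basis = [[sum(P[i][ip] * old_basis[i][r] for i in range(N))
              for r in range(N)] for ip in range(N)]

tau = [[1, 2], [3, 4]]         # arbitrary components tau^{ij}
tau_new = transform_contravariant(tau, Q)

# Invariance: both sets of components describe the same element of L (x) L.
same_tensor = assemble(tau_new, new_basis) == assemble(tau, old_basis)
print(tau_new, same_tensor)
```

Exact rational arithmetic is used so that the equality test is not disturbed by rounding.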

8. For the components of tensors in $T_2^0 = L^* \otimes L^*$ the covariant law of transformation on both indices is valid:

$$\tau_{i'j'} = \sum \tau_{ij} P^i_{i'} P^j_{j'} \tag{II}$$

For components of tensors in $T_1^1 = L \otimes L^*$ we have the contravariant transformation law on the upper index and the covariant law on the lower index:

$$\tau^{i'}_{j'} = \sum \tau^i_j Q^{i'}_i P^j_{j'} \tag{III}$$

Both formulas (II) and (III) are derived from the invariance of tensors in $T_2^0$ and $T_1^1$ relative to the choice of basis in $L$, exactly like formula (I). Here, to derive formula (II) we have to use the familiar equations

$$e^{i'} = \sum Q^{i'}_i e^i \tag{12}$$

in place of (9). To derive (III), use both (9) and (12).

Remark. In turn, the invariance of tensors in $T_2^0$ and $T_1^1$, in the sense explained in Subsection 7 for $T_0^2$, follows from (II) and (III).

9. Linear operations on tensors in $T_0^2$, $T_2^0$ and $T_1^1$ are expressed in terms of components by the usual rules.

(1) When tensors are added, their components are added. For example, if

$$t = \sum \tau^{ij} e_i e_j \in T_0^2, \quad s = \sum \sigma^{ij} e_i e_j \in T_0^2$$

then

$$t + s = \sum (\tau^{ij} + \sigma^{ij}) e_i e_j \in T_0^2$$

Quite naturally, the addition of tensors of different types (structures) is not defined. If we added their components, the result would not be invariant.

(2) In multiplying a tensor by a scalar, all its components are multiplied by that scalar; for example, for $t \in T_0^2$,

$$\alpha t = \sum \alpha \tau^{ij} e_i e_j$$

10. The expression of a contraction in terms of components requires a somewhat more detailed explanation. Suppose we have a second-order contravariant tensor $t = \sum \tau^{ij} e_i e_j \in T_0^2$ and a covariant vector $u = \sum u_k e^k \in L^*$. Let us, for example, consider the right contraction of $t$ with the vector $u$. We have

$$(t, u) = \Big(\sum \tau^{ij} e_i e_j, \sum u_k e^k\Big) = \sum \tau^{ij} u_k e_i (e_j, e^k) = \sum \tau^{ij} u_k \delta^k_j e_i = \sum \Big(\sum \tau^{ik} u_k\Big) e_i$$

Thus, as a result of this contraction we obtain a certain vector $x = \sum x^i e_i \in L$ whose components are found by summing over the second index of $\tau^{ij}$:

$$x^i = \sum \tau^{ik} u_k$$

Similarly, in the case of a left contraction

$$(u, t) = \sum \Big(\sum \tau^{kj} u_k\Big) e_j$$

we obtain a vector $y = \sum y^j e_j \in L$ whose components are found by summing over the first index of $\tau^{ij}$:

$$y^j = \sum \tau^{kj} u_k$$
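In components, the two contractions of Subsection 10 are simply sums over the second or the first index. A minimal sketch, with hypothetical function names and arbitrarily chosen numbers, might look as follows.

```python
def right_contraction(tau, u):
    """x^i = sum over k of tau^{ik} u_k (summation over the second index)."""
    n = len(u)
    return [sum(tau[i][k] * u[k] for k in range(n)) for i in range(n)]

def left_contraction(tau, u):
    """y^j = sum over k of tau^{kj} u_k (summation over the first index)."""
    n = len(u)
    return [sum(tau[k][j] * u[k] for k in range(n)) for j in range(n)]

tau = [[1, 2], [3, 4]]   # components tau^{ij}, chosen only for illustration
u = [5, 6]               # components u_k of a covariant vector

x = right_contraction(tau, u)
y = left_contraction(tau, u)
print(x, y)
```

The two results differ whenever the array of components is not symmetric, which is why the left and right contractions have to be distinguished.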




11. If $t = \sum \tau_{ij} e^i e^j \in T_2^0$ and $x = \sum x^k e_k \in L$, then the right contraction is

$$(t, x) = \sum \Big(\sum \tau_{ik} x^k\Big) e^i$$

This is the vector $u = \sum u_i e^i \in L^*$, that is, a vector of the space conjugate to $L$. Its components are found by summing over the second index of $\tau_{ij}$:

$$u_i = \sum \tau_{ik} x^k$$

The left contraction is also a vector in $L^*$ and reduces to summation over the first index of $\tau_{ij}$:

$$(x, t) = \sum \Big(\sum \tau_{kj} x^k\Big) e^j$$

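12. In the case $t = \sum \tau^i_j e_i e^j \in T_1^1$ a contraction is possible both with a vector $x = \sum x^k e_k \in L$ and with a vector $u = \sum u_k e^k \in L^*$. Namely,

$$(t, x) = \Big(\sum \tau^i_j e_i e^j, \sum x^k e_k\Big) = \sum \tau^i_j x^k e_i (e^j, e_k) = \sum \Big(\sum \tau^i_k x^k\Big) e_i$$

Similarly,

$$(t, u) = \Big(\sum \tau^i_j e_i e^j, \sum u_k e^k\Big) = \sum \tau^i_j u_k (e_i, e^k) e^j = \sum \Big(\sum \tau^k_j u_k\Big) e^j$$

Thus, if $x \in L$, then $(t, x) \in L$. If $u \in L^*$, then $(t, u) \in L^*$.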


13. For $t \in T_1^1$ we have what is called an inner contraction, which consists in the replacement of every pair $a_i b_i$ in $t = a_1 b_1 + \ldots + a_k b_k$ ($a_i \in L$, $b_i \in L^*$) by the contraction $(a_i, b_i)$. This definition is not connected with the choice of basis, and for this reason the inner contraction of a mixed tensor $t$ is an invariant number dependent solely on the choice of the element $t$ in the space $T_1^1$.

If we denote the inner contraction of $t$ by $(t)$, then, in an arbitrary basis, we have

$$(t) = \sum \tau^i_j (e_i, e^j) = \sum \tau^i_j \delta^j_i = \sum \tau^i_i = \tau^1_1 + \tau^2_2 + \ldots + \tau^n_n$$

whence it is also possible to derive the invariance of an inner contraction as a consequence of formula (III). Indeed, from (III) we have

$$\sum \tau^{i'}_{i'} = \sum \tau^i_j \Big(\sum Q^{i'}_i P^j_{i'}\Big) = \sum \tau^i_j \delta^j_i = \sum \tau^i_i$$
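The invariance of the inner contraction under law (III) can be observed numerically. The sketch below assumes a two-dimensional space and a hypothetical transition matrix `P` with inverse `Q`. For contrast, it also transforms the same array of numbers by the covariant law (II), where the sum of the diagonal components is not preserved.

```python
def transform_mixed(tau, P, Q):
    """Law (III): tau'[i'][j'] = sum over i, j of tau[i][j] * Q[i'][i] * P[j][j']."""
    n = len(tau)
    return [[sum(tau[i][j] * Q[ip][i] * P[j][jp]
                 for i in range(n) for j in range(n))
             for jp in range(n)] for ip in range(n)]

def transform_covariant(tau, P):
    """Law (II): tau'[i'][j'] = sum over i, j of tau[i][j] * P[i][i'] * P[j][j']."""
    n = len(tau)
    return [[sum(tau[i][j] * P[i][ip] * P[j][jp]
                 for i in range(n) for j in range(n))
             for jp in range(n)] for ip in range(n)]

def diagonal_sum(tau):
    """Sum of the components with equal first and second index."""
    return sum(tau[i][i] for i in range(len(tau)))

P = [[2, 1], [1, 1]]      # P[i][ip] stands for P^i_{i'} (hypothetical values)
Q = [[1, -1], [-1, 2]]    # Q[ip][i] stands for Q^{i'}_i, the inverse of P
tau = [[1, 2], [3, 4]]    # the same numbers read as tau^i_j or as tau_{ij}

mixed_new = transform_mixed(tau, P, Q)
covariant_new = transform_covariant(tau, P)

print(diagonal_sum(tau), diagonal_sum(mixed_new), diagonal_sum(covariant_new))
```

Only the mixed tensor keeps its diagonal sum, in agreement with the fact that a contraction pairs one upper index with one lower index.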


14. We see that in all cases a contraction reduces, in components, to summation over one contravariant (upper) index and over one covariant (lower) index. In this process, the total order, that is, the total number of indices indicating the tensor components, is reduced by two. When one index remains, the result of contraction is a tensor of order one (a vector of $L$ or of $L^*$). For example, the contraction $\sum \tau^{kj} u_k = y^j$ leaves one free (upper) index and yields the vector $\sum y^j e_j \in L$. In cases where no free indices remain (as in Subsection 13) a numerical invariant results. For this reason, numerical invariants are often called tensors of order zero.

§  4.  Tensors  of  bilinear  forms 

1. Suppose in an $n$-dimensional linear space $L$ we have an invariant bilinear form $a(x, y)$, $x, y \in L$. If a basis $e_1, \ldots, e_n$ is specified in $L$, then the form $a(x, y)$ in this basis has the coordinate (component) representation

$$a(x, y) = \sum a_{ij} x^i y^j \tag{1}$$

Pass to a new basis in $L$:

$$e_{i'} = \sum P^i_{i'} e_i$$

In the new basis, the form $a(x, y)$ receives a different component representation with new coefficients $a_{i'j'}$. By Section 3, Chapter IV,

$$a_{i'j'} = \sum a_{ij} P^i_{i'} P^j_{j'}$$

Thus, the coefficients of the bilinear form in $L$ transform by the covariant law for each index (see (II), Subsection 8, Section 3). Therefore, we can associate to the form $a(x, y)$ a tensor from $T_2^0$, namely

$$a = \sum a_{ij} e^i e^j \tag{2}$$

It is called the tensor of the given bilinear form. From Subsection 7, Section 3, it follows that the tensor $a$ is associated with the form $a(x, y)$ invariantly (that is to say, it is the same, irrespective of the choice of basis in the space $L$).

Conversely, to any tensor (2) in $T_2^0$ there corresponds a bilinear form in $L$. Indeed, if we perform a left contraction of (2) with the vector $x = \sum x^i e_i \in L$ and then contract the vector thus found (in $L^*$) with the vector $y = \sum y^j e_j \in L$, the result will be the right member of (1). Thus, tensor $a$ is associated with the bilinear form

$$a(x, y) = ((x, a), y) = \sum a_{ij} x^i y^j \tag{3}$$

The invariance of such a construction of a bilinear form in $L$ via a preassigned tensor in $T_2^0$ is obvious, since a contraction is an invariant.
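The invariance of the value (3) may be illustrated numerically. The sketch below assumes that the coordinates of a vector of $L$ transform by the contravariant law $x^{i'} = \sum Q^{i'}_i x^i$, in agreement with the convention of law (I); the matrices `P`, `Q` and the sample numbers are hypothetical.

```python
def form_value(a, x, y):
    """Right member of (1): sum over i, j of a_{ij} x^i y^j."""
    n = len(x)
    return sum(a[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def covariant_components(a, P):
    """a_{i'j'} = sum over i, j of a_{ij} P^i_{i'} P^j_{j'}."""
    n = len(a)
    return [[sum(a[i][j] * P[i][ip] * P[j][jp]
                 for i in range(n) for j in range(n))
             for jp in range(n)] for ip in range(n)]

def contravariant_coordinates(x, Q):
    """x^{i'} = sum over i of Q^{i'}_i x^i (assumed coordinate law for vectors of L)."""
    n = len(x)
    return [sum(Q[ip][i] * x[i] for i in range(n)) for ip in range(n)]

P = [[2, 1], [1, 1]]      # P[i][ip] stands for P^i_{i'} (hypothetical values)
Q = [[1, -1], [-1, 2]]    # inverse of P
a = [[1, 2], [3, 4]]      # coefficients a_{ij} of the form
x, y = [1, 2], [3, 1]     # coordinates of two vectors in the old basis

old_value = form_value(a, x, y)
new_value = form_value(covariant_components(a, P),
                       contravariant_coordinates(x, Q),
                       contravariant_coordinates(y, Q))
print(old_value, new_value)
```

The covariant factors $P$ attached to the coefficients cancel against the contravariant factors $Q$ attached to the coordinates, so both computations give the same number.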




2. In order to establish a similar relationship with the theory of bilinear forms for tensors in $T_0^2$ and $T_1^1$ it is necessary to consider bilinear forms of two covariant vector arguments and bilinear forms one of whose vector arguments is contravariant and the other is covariant. Both forms are defined as numerical-valued functions linear in each argument. Besides, we must demand their invariance, that is to say, that their numerical values be independent of any choice of basis (see Subsections 3 and 4 below).

3. The bilinear form $a(u, v)$ with two covariant arguments $u = \sum u_i e^i \in L^*$ and $v = \sum v_j e^j \in L^*$ has the component representation

$$a(u, v) = \sum a^{ij} u_i v_j$$

with coefficients

$$a^{ij} = a(e^i, e^j)$$

from which and from (I*), Subsection 3, Section 1,

$$a^{i'j'} = a(e^{i'}, e^{j'}) = \sum a(e^i, e^j) Q^{i'}_i Q^{j'}_j = \sum a^{ij} Q^{i'}_i Q^{j'}_j$$

Thus, the coefficients of the form $a(u, v)$ transform by the contravariant law for every index (see (I), Subsection 6, Section 3). Therefore, to the form $a(u, v)$ is invariantly associated a tensor of $T_0^2$, namely,

$$a = \sum a^{ij} e_i e_j$$

Conversely, to any preassigned tensor $a \in T_0^2$ there corresponds, in the form of a contraction, the invariant bilinear form

$$a(u, v) = ((u, a), v) = \sum a^{ij} u_i v_j$$

4. For a bilinear form $a(x, u)$ with two distinct arguments $x = \sum x^i e_i \in L$, $u = \sum u_j e^j \in L^*$ we have

$$a(x, u) = \sum a^j_i x^i u_j$$

where

$$a^j_i = a(e_i, e^j)$$

whence

$$a^{j'}_{i'} = a(e_{i'}, e^{j'}) = \sum a(e_i, e^j) P^i_{i'} Q^{j'}_j = \sum a^j_i P^i_{i'} Q^{j'}_j$$

Thus the coefficients of the form $a(x, u)$ transform by the covariant law for the lower index and by the contravariant law for the upper index. Therefore, the form $a(x, u)$ is invariantly associated with a tensor from $T_1^1$, namely

$$a = \sum a^j_i e_j e^i$$

Conversely, to every prespecified tensor $a \in T_1^1$ there corresponds a bilinear form

$$a(x, u) = ((a, x), u) = \sum a^j_i x^i u_j$$

5.  Note  that  the  orders  of  the  tensor  of  a  form  are  opposite  to 
those  of  its  arguments.  For  example,  if  a  certain  argument  of  a 
form  is  covariant,  then  the  corresponding  index  of  the  tensor  of 
that  form  is  contravariant  (upper). 

6. The formulas (2) and (3) of Subsection 1 establish a one-to-one correspondence between bilinear forms in $L$ and tensors in $T_2^0$. This correspondence is obviously an isomorphism relative to linear operations. Thus, in the sense of linear algebra the theory of tensors in $T_2^0$ is equivalent to the theory of forms $a(x, y)$ in $L$. The same may be said with respect to the theory of tensors in $T_0^2$ and in $T_1^1$ and the theory of forms $a(u, v)$ and $a(x, u)$.

However, the construction of a separate theory of tensors (besides the theory of forms) is necessary. Firstly, the contraction operation does not fit into the framework of the theory of forms. Secondly, tensors may be correlated not only with forms but with many other entities of algebra and geometry (and also mechanics and physics). This correlation makes it possible, first of all, to construct invariants of the entities under study by general methods (mostly in the form of contractions). Besides, there thus appears the possibility of expressing relationships between entities in the form of tensor equations, that is, as equalities between tensors. An important peculiarity of tensor equations is their invariance.

7. Suppose, for example, we have the equation

$$\tau^{ij} = 0 \tag{4}$$

What it means is that, relative to a given basis, all components of a certain tensor in $T_0^2$ are zero. But then the components of this tensor are zero in any other basis as well. Arithmetically, this is evident from formula (I), Subsection 6, Section 3, but in actuality it follows directly from the very definition of tensors of $T_0^2$ as invariant objects (equation (4) expresses the invariant fact that the tensor $\tau = \sum \tau^{ij} e_i e_j$ is the zero element of $T_0^2$). Quite naturally, of course, the same may be said of equations of the form $\tau_{ij} = 0$ and $\tau^i_j = 0$.

Because of the invariance of tensor equations, their validity can be proved merely by a verification relative to some one (convenient) basis. This simple bit of reasoning will be used frequently in what follows.




8. We give here a test for distinguishing tensors of order two.

Suppose an object $A$ is defined relative to some basis $e_1, \ldots, e_n$ of the space $L$ with components $a_{ik}$, but it is not known how the components change in passing to another basis. The following proposition holds.

If a contraction of the components $a_{ik}$ over some index with the components of any contravariant vector always has a covariant transformation law relative to the remaining free index, then the components $a_{ik}$ transform by the covariant law for each index.

Proof. We carry out the proof for a contraction over the first index. Let $x = \sum x^i e_i$ be an arbitrary contravariant vector (that is, $x \in L$). We consider the contraction

$$b_k = \sum a_{ik} x^i$$

By hypothesis, we can regard $b_k$ as the components of some vector $b \in L^*$ (relative to the basis $e^1, \ldots, e^n$). Let us now take in $L$ another, also arbitrary, vector $y = \sum y^k e_k$. Then

$$(b, y) = \sum b_k y^k = \sum a_{ik} x^i y^k \tag{5}$$

is an invariant. Consequently, the right member of (5) is a component representation of an invariant bilinear form. Using this fact and Subsection 1, we complete the proof of our proposition.

Remark. By the proposition just proved, the object $A$ is invariantly correlated with a tensor from $T_2^0$. Accordingly, this proposition may be taken as a test for covariant tensors of order two.

9. Similarly, if

$$y^k = \sum a^{ik} u_i$$

is a contravariant vector for any choice of a covariant vector $u_i$, then $a^{ik}$ is a contravariant tensor of order two. If

$$y^k = \sum a^k_i x^i$$

is a contravariant vector for any choice of the contravariant vector $x^i$, then $a^k_i$ is a mixed tensor.

Both assertions reduce (like the previous one) to Subsections 3 and 4.

§  5.  Multiple-order  tensors.  Tensor  product 

1. Since a tensor product of two spaces has been defined, we thus have a definition of a tensor product of any number of spaces. It suffices to multiply them together in any order. If the linear spaces $L_1$, $L_2$, $L_3$ are given, then the product $T = (L_1 \otimes L_2) \otimes L_3$ has for its elements the symbolic sums of any finite number of terms of the form $(ab)c$, where $a \in L_1$, $b \in L_2$, $c \in L_3$. The conditions of equivalence and, respectively, the admissible replacements of the elements of $T$ are obtained by combining the conditions of equivalence of the elements of the product of $L_1 \otimes L_2$ by $L_3$ and of the product of $L_1$ by $L_2$. For example,

$$((a' + a'') b) c = (a'b) c + (a''b) c,$$

$$((\alpha a) b) c = (a (\alpha b)) c = (ab) \alpha c$$

where $\alpha$ is a scalar. Besides that, we require the equivalence condition of an associative nature:

$$(ab) c = a (bc)$$

It signifies the identity $(L_1 \otimes L_2) \otimes L_3 = L_1 \otimes (L_2 \otimes L_3)$ and permits writing $abc$ instead of $(ab)c$ or $a(bc)$. Linear operations in $T$ are defined together with the product of $L_1 \otimes L_2$ by $L_3$. Also defined is a contraction: a left contraction with elements of $L_1^*$ and a right contraction with elements of $L_3^*$. It is also necessary to determine a contraction over the middle element of each triad with the element $u \in L_2^*$ by setting

$$(a_1 b_1 c_1 + \ldots + a_k b_k c_k, u) = (b_1, u) a_1 c_1 + \ldots + (b_k, u) a_k c_k$$

This contraction is an element of the product $L_1 \otimes L_3$.

2. A tensor product of any number of spaces is determined by induction.

3. Suppose we have a linear space $L$. Set

$$T^p_q = (L \otimes L \otimes \ldots \otimes L) \otimes (L^* \otimes L^* \otimes \ldots \otimes L^*)$$

where there are $p$ factors $L$ and $q$ factors $L^*$. Elements in $T^p_q$ will be called tensors over the space $L$, contravariant of order $p$ and covariant of order $q$. For the sake of uniformity we also denote $L$ by $T_0^1$ and $L^*$ by $T_1^0$, which is in conformity with the condition by which elements taken from $L$ and $L^*$ are termed tensors of order one.

4. Since every space $T^p_q$ is linear, linear operations are defined for tensors in each of these spaces. We do not define the addition of tensors taken from different spaces $T^{p_1}_{q_1}$ and $T^{p_2}_{q_2}$.

5. Besides linear operations on tensors in each $T^p_q$, we define the product of tensors taken from any, even distinct, spaces $T^{p_1}_{q_1}$ and $T^{p_2}_{q_2}$.

Let $r \in T^{p_1}_{q_1}$, $s \in T^{p_2}_{q_2}$. The product of $r$ by $s$ is defined to be the ordered pair $rs$ regarded as an element of the tensor product $T^{p_1}_{q_1} \otimes T^{p_2}_{q_2}$. In general, $rs \ne sr$. By Subsection 1 we have, for any $r$, $s$, $t$,

$$(rs) t = r (st)$$

Accordingly, we obtain a product of three tensors: $rst = (rs)t = r(st)$. The product of any number of tensor factors is defined in the same way.


6. Suppose $r$ and $s$ can be represented as a single term, that is, in the form of a product of elements taken from $L$ and $L^*$:

$$r = a_1 \ldots a_{p_1} b_1 \ldots b_{q_1}, \quad s = \tilde a_1 \ldots \tilde a_{p_2} \tilde b_1 \ldots \tilde b_{q_2}$$

where $a_1, \ldots, a_{p_1}, \tilde a_1, \ldots, \tilde a_{p_2} \in L$ and $b_1, \ldots, b_{q_1}, \tilde b_1, \ldots, \tilde b_{q_2} \in L^*$. Then

$$rs = a_1 \ldots a_{p_1} b_1 \ldots b_{q_1} \tilde a_1 \ldots \tilde a_{p_2} \tilde b_1 \ldots \tilde b_{q_2}$$

If $rs$ contains factors solely from $L$ or from $L^*$ alone, then their order is essential (by Subsection 1 of Section 2). In the general case, let us agree in the product $rs$ to write out first all elements of $L$ and then all elements of $L^*$, retaining in each case the given sequence of the factors. Thus

$$rs = a_1 \ldots a_{p_1} \tilde a_1 \ldots \tilde a_{p_2} b_1 \ldots b_{q_1} \tilde b_1 \ldots \tilde b_{q_2}$$

This introduces a new condition for the equivalence of tensors. It is generally accepted in most works on tensor calculus but not in all (see, for example, Sternberg’s Lectures on Differential Geometry [23]).

7. Set $a = a_1 \ldots a_{p_1}$, assuming that $a_1, \ldots, a_{p_1}$ may be arbitrary elements of $L$. If $p_1 = 0$, then we agree to put $a = 1$. We will regard $\tilde a$, $b$ and $\tilde b$ in similar fashion. Then the arbitrary tensors $r$ and $s$ ($r \in T^{p_1}_{q_1}$, $s \in T^{p_2}_{q_2}$) may be symbolized as

$$r = \sum ab, \quad s = \sum \tilde a \tilde b$$

By Subsection 1, we can obtain the product $rs$ by a termwise multiplication of the first of these sums by the second. Taking into account Subsection 6, we have

$$rs = \sum a \tilde a b \tilde b$$

whence it is clear that $rs \in T^{p_1 + p_2}_{q_1 + q_2}$.




8. Let $t \in T^p_q$ with $p \ge 1$ and $q \ge 1$. As before we put $a = a_1 \ldots a_p$, $b = b_1 \ldots b_q$, and

$$t = \sum ab$$

From among $1, 2, \ldots, p$ we choose a number $l$ and from among $1, 2, \ldots, q$ we choose a number $m$. Denote by $a'$ the product of all elements $a_1, \ldots, a_p$ with the exception of $a_l$, and by $b'$ the product of all elements $b_1, \ldots, b_q$ except $b_m$. We define the inner contraction of a tensor $t$ over the $l$th elements of $L$ and the $m$th elements of $L^*$ to be the entity

$$(t)^l_m = \sum (a_l, b_m) a' b'$$

Here, $(a_l, b_m)$ is the contraction of the element $a_l \in L$ with the element $b_m \in L^*$, which is to say it is a number (distinct for every term of the sum). Thus, the contraction $(t)^l_m$ is a tensor of $T^{p-1}_{q-1}$. If $p - 1 \ge 1$ and $q - 1 \ge 1$, then in turn we can apply the contraction operation to the tensor $(t)^l_m$ to obtain a tensor from $T^{p-2}_{q-2}$. Both operations can naturally be done at once. For example,

$$(t)^{12}_{12} = ((t)^2_2)^1_1 = \sum (a_1, b_1)(a_2, b_2) a_3 \ldots a_p b_3 \ldots b_q$$

If $p = q$, it is possible to exhaust all the orders of the tensor $t$ and obtain what is called a complete contraction (a scalar).

9. The contractions of two tensors that we examined above can always be reduced to the inner contraction of one tensor. All that is needed is to multiply the given tensors and perform an inner contraction of their product. For instance, a contraction of the element $x \in L$ with the element $u \in L^*$ is an inner contraction of the product $xu$; the right contraction of the tensor $t \in T_0^2$ with the element $u \in L^*$ is $(tu)^2_1$.

10. In concluding this section, we have a number of very important remarks to make concerning operations on tensors. First of all, because of what was said in Subsection 1, linear operations in $T^p_q$ are invariant under admissible replacements of elements in $T^p_q$. The product $rs \in T^{p_1 + p_2}_{q_1 + q_2}$ of $r \in T^{p_1}_{q_1}$ and $s \in T^{p_2}_{q_2}$ is invariant under admissible replacements of the elements $r$ and $s$ in $T^{p_1}_{q_1}$ and $T^{p_2}_{q_2}$; that is, under such transformations it itself also receives an admissible replacement in $T^{p_1 + p_2}_{q_1 + q_2}$. Without these properties, the definition of linear operations on tensors and of a tensor product would be meaningless.



11. From what has been said in Subsection 1 of this section and Subsection 8 of Section 2, it follows that under an admissible replacement of the tensor $t$ in $T^p_q$ the contraction $(t)^l_m$ also undergoes an admissible replacement in $T^{p-1}_{q-1}$, and so from this we have that if $t_2 = t_1$, then $(t_2)^l_m = (t_1)^l_m$.

Proof. The equation $t_2 = t_1$ signifies that the tensors $t_2$ and $t_1$ can, by means of admissible replacements, be reduced to one and the same collection $t$ of products of elements taken from $L$ and $L^*$. But then $(t_1)^l_m$ and $(t_2)^l_m$ are reduced, via admissible replacements, to $(t)^l_m$.

12.  We  have  not  made  any  use  of  bases  in  our  definitions.  For 
this  reason,  linear  operations  on  tensors  and  also  the  operations 
of  tensor  multiplication  and  inner  contraction  yield  results  that 
are  invariant  in  the  sense  of  independence  of  choice  of  basis. 

In  particular,  the  complete  contraction  of  a  tensor  is  a  numerical 
invariant. 

§  6.  Components  of  multiple-order  tensors 

1. Let $e_1, \ldots, e_n$ be a basis in $L$ and $e^1, \ldots, e^n$ the reciprocal basis in $L^*$. From Theorem 2, Section 3, it follows (by induction) that all possible products of the type $e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$ constitute a basis in $T^p_q$.

Thus, for any $t \in T^p_q$ we have the decomposition

$$t = \sum \tau^{i_1 \ldots i_p}_{j_1 \ldots j_q} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$$

The numbers $\tau^{i_1 \ldots i_p}_{j_1 \ldots j_q}$ define the tensor $t$ and are called the components of the tensor relative to the basis $e_1, \ldots, e_n$ of the space $L$. Relative to a given basis, they can be specified arbitrarily, that is, no matter what numbers $\tau^{i_1 \ldots i_p}_{j_1 \ldots j_q}$ are taken, a certain tensor is always defined. One often writes $t = \{\tau^{i_1 \ldots i_p}_{j_1 \ldots j_q}\}$ and states that “a tensor $\{\tau^{i_1 \ldots i_p}_{j_1 \ldots j_q}\}$ is given”. It is well to bear in mind, however, that the actual specification of some concrete tensor, even a tensor of order three, requires rather involved information in the form of tables, since the numerical values of the components must be indicated for every combination of indices.




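2. When passing to a new basis, the components $\tau^{i_1 \ldots i_p}_{j_1 \ldots j_q}$ transform by the contravariant law for every upper index and by the covariant law for every lower index, namely,

$$\tau^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q} = \sum \tau^{i_1 \ldots i_p}_{j_1 \ldots j_q} Q^{i'_1}_{i_1} \ldots Q^{i'_p}_{i_p} P^{j_1}_{j'_1} \ldots P^{j_q}_{j'_q} \tag{*}$$

where the summation is over unprimed indices.

Proof. By (I), (I*) of Section 1,

$$e_{i'} = \sum P^i_{i'} e_i, \quad e^{i'} = \sum Q^{i'}_i e^i$$

The reciprocal formulas are

$$e_i = \sum Q^{i'}_i e_{i'}, \quad e^i = \sum P^i_{i'} e^{i'}$$

whence

$$t = \sum \tau^{i_1 \ldots i_p}_{j_1 \ldots j_q} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q} = \sum \Big(\sum \tau^{i_1 \ldots i_p}_{j_1 \ldots j_q} Q^{i'_1}_{i_1} \ldots Q^{i'_p}_{i_p} P^{j_1}_{j'_1} \ldots P^{j_q}_{j'_q}\Big) e_{i'_1} \ldots e_{i'_p} e^{j'_1} \ldots e^{j'_q} \tag{1}$$

On the other hand,

$$t = \sum \tau^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q} e_{i'_1} \ldots e_{i'_p} e^{j'_1} \ldots e^{j'_q} \tag{2}$$

Comparing (1) and (2), we get (*), which is what we wanted.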
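The general law (*) lends itself to a direct, if inefficient, implementation. The sketch below is an illustration only: the representation of a tensor as a dictionary keyed by index tuples (upper indices first, then lower ones), the function name `transform` and the sample matrices are assumptions made here.

```python
from itertools import product

def transform(components, p, q, P, Q):
    """Apply law (*) to a tensor with p upper and q lower indices.

    components maps index tuples (i_1, ..., i_p, j_1, ..., j_q) to numbers;
    P[i][ip] stands for P^i_{i'} and Q[ip][i] for Q^{i'}_i.
    """
    n = len(P)
    new_components = {}
    for new_idx in product(range(n), repeat=p + q):
        total = 0
        for old_idx in product(range(n), repeat=p + q):
            term = components[old_idx]
            for pos in range(p):             # upper indices: factor Q^{i'}_i
                term *= Q[new_idx[pos]][old_idx[pos]]
            for pos in range(p, p + q):      # lower indices: factor P^j_{j'}
                term *= P[old_idx[pos]][new_idx[pos]]
            total += term
        new_components[new_idx] = total
    return new_components

P = [[2, 1], [1, 1]]      # hypothetical transition matrix
Q = [[1, -1], [-1, 2]]    # its inverse

# A mixed tensor of T^1_1 with components tau^i_j = [[1, 2], [3, 4]].
tau = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
tau_new = transform(tau, 1, 1, P, Q)

# Passing back to the old basis exchanges the roles of P and Q.
tau_back = transform(tau_new, 1, 1, Q, P)
print(tau_new, tau_back == tau)
```

The cost grows like $n^{2(p+q)}$, so the sketch is meant for checking small examples rather than for serious computation.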


3. We derived the transformation law (*) as a consequence of the invariance of tensors $t \in T^p_q$ (we took advantage of the invariance of $t$ when we compared (1) and (2)). Conversely, the invariance of the tensors $t \in T^p_q$ follows from (*). Namely, due to (*),

$$\sum \tau^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q} e_{i'_1} \ldots e_{i'_p} e^{j'_1} \ldots e^{j'_q} = \sum \tau^{i_1 \ldots i_p}_{j_1 \ldots j_q} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$$

We will not derive this equation from (*) but will refer the reader to Subsection 7, Section 3, where the essence of the matter is demonstrated in a special case.


4. Linear operations on tensors taken in some $T^p_q$ are expressed in terms of components by the ordinary rules: in addition of tensors the components are added, in the multiplication of a tensor by a scalar the components are multiplied by that scalar.




5. In the multiplication of a tensor $r \in T^{p_1}_{q_1}$ by a tensor $s \in T^{p_2}_{q_2}$ each component of $r$ is multiplied by each component of $s$ and all such products constitute the components of tensor $rs \in T^{p_1 + p_2}_{q_1 + q_2}$. For example, if $r = \sum r_i e^i$ and $s = \sum s_j e^j$ ($r, s \in T_1^0$), then $rs = \sum r_i s_j e^i e^j \in T_2^0$. Assuming $rs = t = \sum t_{ij} e^i e^j$, we get $t_{ij} = r_i s_j$.

Remark. In general, $rs \ne sr$. The inequality of $rs$ and $sr$ is also readily discernible in components. Indeed, setting $sr = \tilde t = \sum \tilde t_{ij} e^i e^j$, we get $\tilde t = \sum s_j r_i e^j e^i = \sum s_i r_j e^i e^j$, whence $\tilde t_{ij} = s_i r_j$ and, in general, $\tilde t_{ij} \ne t_{ij}$.
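The component rule for a product of two covariant vectors, and the failure of commutativity, can be seen in a few lines. The function name and the numbers below are hypothetical and serve only as an illustration.

```python
def tensor_product(r, s):
    """Components t_{ij} = r_i s_j of the product rs of two covariant vectors."""
    return [[r_i * s_j for s_j in s] for r_i in r]

r = [1, 2]   # components r_i
s = [3, 4]   # components s_j

rs = tensor_product(r, s)   # t_{ij} = r_i s_j
sr = tensor_product(s, r)   # components s_i r_j of the product sr

print(rs, sr, rs == sr)
```

The array for $sr$ is the transpose of the array for $rs$, so the two products coincide only when that array is symmetric.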


6. The components of the contraction $(t)^l_m$ are obtained from the components of the tensor $t$ by summing over one upper and one lower index, the upper index occupying the $l$th position, and the lower index the $m$th position. This is best seen in a concrete example. Let

$$t = \sum a^{ij}_{\alpha\beta} e_i e_j e^\alpha e^\beta$$

Then, for instance,

$$(t)^1_2 = \sum (e_i, e^\beta) a^{ij}_{\alpha\beta} e_j e^\alpha = \sum a^{ij}_{\alpha\beta} \delta^\beta_i e_j e^\alpha = \sum \Big(\sum_i a^{ij}_{\alpha i}\Big) e_j e^\alpha$$

Thus, the components of $(t)^1_2$ are sums of the form $\sum_i a^{ij}_{\alpha i}$, where the summation is taken over the first upper index and the second lower index.
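The example just considered can be written out for a tensor stored as a nested list. In this sketch the layout `a[i][j][alpha][beta]` for the component with upper indices $i, j$ and lower indices $\alpha, \beta$, the function name, and the sample values are all assumptions made for illustration.

```python
def contract_first_upper_second_lower(a, n):
    """Components of (t)^1_2: b[j][alpha] = sum over i of a[i][j][alpha][i]."""
    return [[sum(a[i][j][alpha][i] for i in range(n)) for alpha in range(n)]
            for j in range(n)]

n = 2
# Hypothetical components: the decimal digits of each value spell out its
# indices (i, j, alpha, beta), which makes the summation easy to follow.
a = [[[[1000 * i + 100 * j + 10 * alpha + beta for beta in range(n)]
       for alpha in range(n)]
      for j in range(n)]
     for i in range(n)]

b = contract_first_upper_second_lower(a, n)
print(b)
```

The result has one upper index $j$ and one lower index $\alpha$, in agreement with the reduction of the total order by two.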


§  7.  Multilinear  forms  and  their  tensors 

1. Suppose we have an invariant numerical function $a(x_1, \ldots, x_q, u^1, \ldots, u^p)$ of the vector arguments $x_1, \ldots, x_q \in L$, $u^1, \ldots, u^p \in L^*$. Such a function is called a multilinear form if it is linear in each argument.

2. Suppose in space $L$ we have chosen a basis $e_1, \ldots, e_n$ and in space $L^*$ a reciprocal basis $e^1, \ldots, e^n$. Then each contravariant argument $x_k \in L$ of the given form may be decomposed in terms of the basis $e_1, \ldots, e_n$:

$$x_k = \sum x^i_k e_i = x^1_k e_1 + \ldots + x^n_k e_n$$

Similarly, each covariant argument may be decomposed in terms of the basis $e^1, \ldots, e^n$:

$$u^k = \sum u^k_i e^i = u^k_1 e^1 + \ldots + u^k_n e^n$$

whence

$$a(x_1, \ldots, x_q, u^1, \ldots, u^p) = a\Big(\sum x^{i_1}_1 e_{i_1}, \ldots, \sum x^{i_q}_q e_{i_q}, \sum u^1_{j_1} e^{j_1}, \ldots, \sum u^p_{j_p} e^{j_p}\Big)$$

Thus, by the linearity of the form we obtain its component representation

$$a(x_1, \ldots, x_q, u^1, \ldots, u^p) = \sum a^{j_1 \ldots j_p}_{i_1 \ldots i_q} x^{i_1}_1 \ldots x^{i_q}_q u^1_{j_1} \ldots u^p_{j_p} \tag{1}$$

where

$$a^{j_1 \ldots j_p}_{i_1 \ldots i_q} = a(e_{i_1}, \ldots, e_{i_q}, e^{j_1}, \ldots, e^{j_p}) \tag{2}$$

are the coefficients of the right member of (1). According to (2) they are the values of the form on the basis vectors.


3. When passing to a new basis, we have in $L$

$$e_{i'} = \sum P^i_{i'} e_i \tag{3}$$

and respectively in $L^*$,

$$e^{j'} = \sum Q^{j'}_j e^j \tag{4}$$

In the new basis, the component representation of the form will have new coefficients. Because of the invariance of the form, they too are its values on the basis vectors (which, naturally, are new). Thus,

$$a^{j'_1 \ldots j'_p}_{i'_1 \ldots i'_q} = a(e_{i'_1}, \ldots, e_{i'_q}, e^{j'_1}, \ldots, e^{j'_p}) \tag{5}$$

From (5), with account taken of (2), (3) and (4), we find

$$a^{j'_1 \ldots j'_p}_{i'_1 \ldots i'_q} = \sum a^{j_1 \ldots j_p}_{i_1 \ldots i_q} P^{i_1}_{i'_1} \ldots P^{i_q}_{i'_q} Q^{j'_1}_{j_1} \ldots Q^{j'_p}_{j_p} \tag{6}$$

where the summation on the right is over unprimed indices.

We see that the transformation law (6) of coefficients of the invariant multilinear form $a(x_1, \ldots, x_q, u^1, \ldots, u^p)$ coincides with the transformation law (*) of Section 6 of the components of a tensor in $T^p_q$. Hence, to every form $a(x_1, \ldots, x_q, u^1, \ldots, u^p)$ is invariantly associated a tensor in $T^p_q$:

$$a = \sum a^{j_1 \ldots j_p}_{i_1 \ldots i_q} e_{j_1} \ldots e_{j_p} e^{i_1} \ldots e^{i_q} \tag{7}$$

It is called the tensor of the form.

Conversely, to every tensor (7) is associated an invariant multilinear form (1). Note that this form is a complete contraction of the product $a x_1 \ldots x_q u^1 \ldots u^p$.


§  8.  Symmetrization  and  antisymmetrization  (alternation). 
Skew-symmetric  forms 

1. Let us consider, relative to the basis $e_1, \ldots, e_n \in L$, the multilinear form

$$a(x, y, z, \ldots, t) = \sum a_{ijk \ldots s} x^i y^j z^k \ldots t^s \tag{1}$$

and the corresponding (covariant) tensor $a_{ijk \ldots s}$. The form (1) is said to be symmetric in a given pair of arguments if interchanging them does not change the value of the form. For example, $a(x, y, z, \ldots, t)$ is symmetric in the first and third arguments if $a(x, y, z, \ldots, t) = a(z, y, x, \ldots, t)$ for any $x, y, z, \ldots, t \in L$.

The symmetry of a form in a given pair of arguments implies the symmetry of its tensor in the corresponding pair of indices. In our case we have symmetry of the tensor in the first and third indices: $a_{ijk \ldots s} = a_{kji \ldots s}$. Indeed,

$$a_{ijk \ldots s} = a(e_i, e_j, e_k, \ldots, e_s) = a(e_k, e_j, e_i, \ldots, e_s) = a_{kji \ldots s}$$

Conversely, if, say, $a_{ijk \ldots s} = a_{kji \ldots s}$, then

$$a(x, y, z, \ldots, t) = \sum a_{ijk \ldots s} x^i y^j z^k \ldots t^s = \sum a_{kji \ldots s} x^i y^j z^k \ldots t^s = \sum a_{ijk \ldots s} z^i y^j x^k \ldots t^s = a(z, y, x, \ldots, t)$$

A multilinear form is said to be symmetric if it is symmetric in every pair of arguments; associated with a symmetric form is a symmetric tensor.

2. The form (1) is said to be skew-symmetric in a given pair of arguments if interchanging them alters the sign of the form. Say, $a(x, y, z, \ldots, t)$ is skew-symmetric with respect to the first and third arguments if $a(x, y, z, \ldots, t) = -a(z, y, x, \ldots, t)$ for arbitrary $x, y, z, \ldots, t$. The tensor of such a form is skew-symmetric with respect to the first and third indices, that is, $a_{ijk \ldots s} = -a_{kji \ldots s}$.

A multilinear form is said to be skew-symmetric (or skew) if it is skew-symmetric with respect to every pair of arguments. Associated with a skew-symmetric form is a skew-symmetric tensor.

A skew-symmetric form does not alter its numerical value under any even permutation of its arguments. Under any odd permutation of its arguments, a skew-symmetric form is multiplied by minus one.




3. Symmetry or skew-symmetry of a form of covariant arguments and, correspondingly, of a contravariant tensor is defined in complete analogy with the foregoing. In the case of a mixed tensor, the properties of symmetry or skew-symmetry can occur for lower indices or for upper indices. But for a pair of indices of which one is lower and the other upper, these properties are not invariant. For example, for a tensor $a^k_i$ in some basis we can have the equation $a^k_i = a^i_k$ ($i, k = 1, 2, \dots, n$), but in general it does not hold true in a change to another basis.

4. If $a(x, y, z, \dots, t)$ is an arbitrary multilinear form, then it can be associated, via a definite standard procedure, with a symmetric form of the same arguments. Namely,

$$(a(x, y, z, \dots, t)) = \frac{1}{m!} \sum a(x, y, z, \dots, t)$$

where the sum on the right is taken over all permutations of the symbols $x, y, z, \dots, t$; $m$ is the number of these symbols (the number of arguments). This operation is called symmetrization and is denoted by parentheses (round brackets). In the special cases of $m = 2$ and $m = 3$ we have

$$(a(x, y)) = \frac{1}{2!} \{a(x, y) + a(y, x)\},$$

$$(a(x, y, z)) = \frac{1}{3!} \{a(x, y, z) + a(y, z, x) + a(z, x, y) + a(y, x, z) + a(x, z, y) + a(z, y, x)\}$$

Associated with symmetrization of a form is symmetrization of a tensor. For instance,

$$a_{(ij)} = \frac{1}{2!} (a_{ij} + a_{ji}),$$

$$a_{(ijk)} = \frac{1}{3!} \{a_{ijk} + a_{jki} + a_{kij} + a_{jik} + a_{ikj} + a_{kji}\}$$

If  the  form  a  is  symmetric,  then  (a)=  a. 
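The symmetrization of a tensor described above can be sketched in a few lines of Python. This is only an illustration: the function name `symmetrize`, the dictionary representation of a tensor and the sample coefficients are assumptions made here, not part of the text. Indices run from 0 instead of 1.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def symmetrize(a, n, m):
    """Return a_(i1...im): the mean of a over all permutations of its m indices.

    `a` maps index tuples (each index in range(n)) to numbers.
    """
    result = {}
    for idx in product(range(n), repeat=m):
        total = sum(
            Fraction(a[tuple(idx[p] for p in perm)])
            for perm in permutations(range(m))
        )
        result[idx] = total / factorial(m)
    return result


# A hypothetical non-symmetric tensor a_ijk on a 2-dimensional space.
n, m = 2, 3
a = {idx: 4 * idx[0] + 2 * idx[1] + idx[2] + 1
     for idx in product(range(n), repeat=m)}
s = symmetrize(a, n, m)

# The result no longer depends on the order of the indices,
# and symmetrizing a second time changes nothing: ((a)) = (a).
print(s[(0, 0, 1)], s[(0, 1, 0)], s[(1, 0, 0)])
print(symmetrize(s, n, m) == s)
```

Exact fractions are used so that the factor $1/m!$ introduces no rounding.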

5. We have to do, for example, with symmetrization of the tensor of a multilinear form when the arguments of the form are identical. For instance, if

$$a(x, y) = \sum a_{ik}\, x^i y^k$$

is a bilinear form (in general not symmetric) and we construct the quadratic form $a(x, x)$, then we can collect terms thus:

$$a_{ik} x^i x^k + a_{ki} x^k x^i = (a_{ik} + a_{ki})\, x^i x^k = 2 a_{(ik)}\, x^i x^k$$




The coefficients of the resulting quadratic form are the numbers $a_{(ik)}$; they constitute its (symmetric) matrix. The polar form of $a(x, x)$ is the form $(a(x, y))$. Similarly, $a_{(ijk)}$ are called the coefficients of the cubic form $a(x, x, x)$ obtained via identification of the arguments of the trilinear form $a(x, y, z)$.
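As a small worked illustration (the numerical coefficients are chosen arbitrarily here and do not come from the text), take $n = 2$ and the bilinear form with $a_{11} = 1$, $a_{12} = 4$, $a_{21} = 2$, $a_{22} = 3$:

```latex
\begin{aligned}
a(x, y) &= x^1 y^1 + 4\,x^1 y^2 + 2\,x^2 y^1 + 3\,x^2 y^2, \\
a(x, x) &= (x^1)^2 + (4 + 2)\,x^1 x^2 + 3\,(x^2)^2, \\
a_{(12)} &= a_{(21)} = \tfrac{1}{2}(4 + 2) = 3, \\
(a(x, y)) &= x^1 y^1 + 3\,(x^1 y^2 + x^2 y^1) + 3\,x^2 y^2 .
\end{aligned}
```

The quadratic form has the symmetric matrix with entries $1, 3, 3, 3$, and putting $y = x$ in the polar form $(a(x, y))$ returns $a(x, x)$.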

6. The operation of antisymmetrization (alternation) consists in the following: an arbitrary multilinear form $a(x, y, z, \dots, t)$ is associated, via a definite standard procedure, with a skew-symmetric form of those same arguments. It is denoted by square brackets and is defined by the equation

$$[a(x, y, z, \dots, t)] = \frac{1}{m!} \Bigl\{ \sum{}' a(x, y, z, \dots, t) - \sum{}'' a(x, y, z, \dots, t) \Bigr\}$$

where the first sum is taken over all even permutations of the symbols $x, y, z, \dots, t$, and the second over all odd permutations. For instance,

$$[a(x, y)] = \frac{1}{2!} \{a(x, y) - a(y, x)\},$$

$$[a(x, y, z)] = \frac{1}{3!} \{a(x, y, z) + a(y, z, x) + a(z, x, y) - a(y, x, z) - a(x, z, y) - a(z, y, x)\}$$

Accordingly, we have the operation of antisymmetrization (alternation) of a tensor:

$$a_{[ij]} = \frac{1}{2!} (a_{ij} - a_{ji}),$$

$$a_{[ijk]} = \frac{1}{3!} (a_{ijk} + a_{jki} + a_{kij} - a_{jik} - a_{ikj} - a_{kji})$$

If the form $a$ is skew (skew-symmetric with respect to all arguments), then $[a] = a$.
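Alternation of a tensor can be sketched in the same way as symmetrization, with each permutation weighted by its sign. The helper names `parity` and `alternate` and the sample tensor are assumptions made for this illustration only. Indices run from 0.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def parity(perm):
    """Return +1 for an even permutation of 0..m-1 and -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:  # each inversion flips the sign
                sign = -sign
    return sign


def alternate(a, n, m):
    """Return a_[i1...im]: signed mean of a over all permutations of its indices."""
    result = {}
    for idx in product(range(n), repeat=m):
        total = sum(
            parity(perm) * Fraction(a[tuple(idx[p] for p in perm)])
            for perm in permutations(range(m))
        )
        result[idx] = total / factorial(m)
    return result


# Hypothetical tensor a_ijk = x_i * y_j * z_k on a 3-dimensional space.
n, m = 3, 3
x, y, z = (1, 2, 3), (1, 4, 9), (1, 8, 27)
a = {(i, j, k): x[i] * y[j] * z[k] for i, j, k in product(range(n), repeat=m)}
b = alternate(a, n, m)

print(b[(0, 1, 2)], b[(1, 0, 2)], b[(0, 0, 1)])
print(alternate(b, n, m) == b)  # [a] is skew, so alternating again changes nothing
```

Interchanging two indices of the result changes its sign, and a component with two equal indices vanishes.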

7. We note two simple properties of the set of skew-symmetric forms:

(1) A skew-symmetric form vanishes if any two of its arguments are identical.

Indeed, suppose, say, the first two arguments of the form coincide. Interchanging them alters the sign of the form but leaves the arguments as they were, so $a(y, y, z, \dots, t) = -a(y, y, z, \dots, t)$. Hence, $2a(y, y, z, \dots, t) = 0$.

(2) If the number of arguments of a skew-symmetric form exceeds the dimensionality of the space, then the form is identically zero.




Indeed, in this case the arguments are linearly related and hence one of them is linearly expressible in terms of the others. Suppose, for example, $x = \alpha y + \beta z + \dots + \lambda t$. Then

$$a(x, y, z, \dots, t) = \alpha\, a(y, y, z, \dots, t) + \beta\, a(z, y, z, \dots, t) + \dots + \lambda\, a(t, y, z, \dots, t)$$

and by the first property $a(y, y, z, \dots, t) = 0$, $\dots$, $a(t, y, z, \dots, t) = 0$, whence $a(x, y, z, \dots, t) = 0$.


8. We now consider skew forms in which the number of arguments is equal to the dimension of the space.

Let $a(x, y, \dots, t)$ be an arbitrary skew form of $n$ arguments $x, y, \dots, t$ belonging to an $n$-dimensional space $L$. Fix an arbitrary basis $e_1, \dots, e_n$ in $L$ and decompose the arguments of the form relative to this basis. By Subsection 2, Section 7, we have

$$a(x, y, \dots, t) = \sum a_{i_1 i_2 \dots i_n}\, x^{i_1} y^{i_2} \dots t^{i_n} \qquad (2)$$

where

$$a_{i_1 i_2 \dots i_n} = a(e_{i_1}, e_{i_2}, \dots, e_{i_n})$$

From the definition of a skew form and from property (1) of the preceding subsection,

$$a(e_{i_1}, e_{i_2}, \dots, e_{i_n}) = \varepsilon_{i_1 i_2 \dots i_n} \cdot a(e_1, e_2, \dots, e_n)$$

that is,

$$a_{i_1 i_2 \dots i_n} = \varepsilon_{i_1 i_2 \dots i_n}\, a_{12 \dots n} \qquad (3)$$

Here $\varepsilon_{i_1 i_2 \dots i_n}$ is equal to $+1$ or $-1$ according as $i_1, i_2, \dots, i_n$ is an even or an odd permutation of $1, 2, \dots, n$, and to zero if two of the indices coincide.

Substituting (3) into (2), we see that the form at hand can be represented relative to the given basis as

$$a(x, y, \dots, t) = a_{12 \dots n} \cdot \Delta \qquad (4)$$

where $\Delta$ is a determinant made up of the components of the arguments:

$$\Delta = \begin{vmatrix} x^1 & x^2 & \dots & x^n \\ y^1 & y^2 & \dots & y^n \\ \vdots & \vdots & & \vdots \\ t^1 & t^2 & \dots & t^n \end{vmatrix} \qquad (5)$$

We will call the coefficient $a_{12 \dots n}$ the principal coefficient of the skew form (2). All the other coefficients of the form are either zero or equal to $\pm a_{12 \dots n}$ (in accordance with formula (3)).

If $a_{12 \dots n} = 0$, then the skew form $a(x, y, \dots, t)$ is identically zero.




If $a_{12 \dots n} \neq 0$, then $a(x, y, \dots, t)$ is different from zero when the arguments are linearly independent.

From formulas (4) and (5) it follows that, up to a numerical factor, there is only one skew form of $n$ arguments in the space $L$.

Indeed, if the given form $a(x, y, \dots, t)$ is not identically zero, and $b(x, y, \dots, t)$ is any other skew form of $n$ arguments, then

$$b(x, y, \dots, t) = b_{12 \dots n} \cdot \Delta = \theta \cdot a(x, y, \dots, t)$$

where

$$\theta = \frac{b_{12 \dots n}}{a_{12 \dots n}}$$

Note that the reasoning in this section is carried through relative to a given basis and only the skew symmetry of the forms is utilized and not their invariance. We will take advantage of this fact in the next chapter where we consider skew-symmetric multilinear functions whose numerical value is not invariant with respect to a change of basis.
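The representation of a skew form of $n$ arguments as the principal coefficient times a determinant can be checked numerically. The sketch below is an illustration under assumptions made here: the names `skew_tensor`, `evaluate` and `det`, the value 5 of the principal coefficient and the sample vectors are all hypothetical. Indices run from 0.

```python
from itertools import permutations, product


def parity(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def det(rows):
    """Determinant computed by the permutation expansion."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for r in range(n):
            term *= rows[r][perm[r]]
        total += term
    return total


def skew_tensor(principal, n):
    """Coefficients of the skew form of n arguments with the given principal coefficient."""
    a = {idx: 0 for idx in product(range(n), repeat=n)}
    for perm in permutations(range(n)):
        a[perm] = parity(perm) * principal  # all other coefficients stay zero
    return a


def evaluate(a, vectors):
    """Sum of a_{i1...in} x^{i1} y^{i2} ... t^{in} over all index tuples."""
    total = 0
    for idx, coeff in a.items():
        term = coeff
        for vec, i in zip(vectors, idx):
            term *= vec[i]
        total += term
    return total


n = 3
a = skew_tensor(5, n)
x, y, t = [1, 2, 0], [0, 1, 3], [2, -1, 4]
value = evaluate(a, [x, y, t])
print(value, 5 * det([x, y, t]))

# Linearly dependent arguments (here the third is x + y) give zero.
print(evaluate(a, [x, y, [1, 3, 3]]))
```

The same check with any other principal coefficient gives a proportional value, which is the uniqueness up to a numerical factor.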

9. A more detailed study of skew-symmetric forms and skew-symmetric tensors is made in Chapter X.

§  9.  An  alternative  description  of  the  tensor  product 
of  two  linear  spaces 

1. At the beginning of this chapter, in Section 2, we defined a tensor product $T = L \otimes \tilde{L}$ as a set whose elements are any finite collections of pairs consisting of elements of $L$ and $\tilde{L}$. Thus, if certain specific entities are chosen as elements of $L$ and $\tilde{L}$, then the elements of $L \otimes \tilde{L}$ are quite concrete (collections of pairs of these entities). In order to make $L \otimes \tilde{L}$ a linear space we had to include a description of admissible replacements and linear operations for the elements of the set $L \otimes \tilde{L}$. Since we had to construct the linear space $L \otimes \tilde{L}$, the description of a tensor product appeared to be rather unwieldy.

We now give a definition of the tensor product $L \otimes \tilde{L}$ that is quite independent of the one previously described. It will be more economical in the sense that for $L \otimes \tilde{L}$ we will from the start take a certain linear space $T$ (the meaning of the equation $T = L \otimes \tilde{L}$ will consist in establishing only certain interrelationships between the elements of $L$, $\tilde{L}$ and $T$). Unfortunately, the new definition will have some defects of its own. The point is that there will be a good deal of arbitrariness in the new construction of the tensor product $L \otimes \tilde{L}$ (unlike the old construction, where the elements of $L \otimes \tilde{L}$ are completely determined by the elements of $L$ and $\tilde{L}$). For this reason, first of all, given $L$ and $\tilde{L}$, we define not one tensor product $L \otimes \tilde{L}$ but a set of distinct products. But then we define a certain natural notion of the isomorphism of these tensor products. They will turn out to be isomorphic (equivalent) to one another and also to the tensor product of $L$ by $\tilde{L}$ in the meaning of our old construction in Section 2. Of course, this will require a certain amount of time and energy with the result that there will probably be no saving after all.

2. We give this alternative description of a tensor product with the aim, as far as this is possible, of helping the reader to clarify the following problem.

Let it be said that a linear space $T$ is the tensor product of a linear space $L$ and a linear space $\tilde{L}$. We confine ourselves here to the finite-dimensional case. Then the dimension of $T$ is equal to the product of the dimensions of $L$ and $\tilde{L}$. Naturally, by itself this relation of dimensions does not suffice to characterize $T$ as the tensor product $L \otimes \tilde{L}$. The point is that in specific instances we may not be able to perceive the construction of a tensor product as described in Section 2. What is more, all three spaces, $L$, $\tilde{L}$ and $T$, may be given to within linear isomorphisms, and then the symbolic sums described in Section 2 are replaced by elements of quite a different nature. Therefore, an answer to the question of what it means that $T$ is the product $L \otimes \tilde{L}$ actually cannot be given in such a general situation on the basis of Section 2. This requires giving the very definition of a tensor product in a more general form, which is precisely what we now intend to do.

3. Suppose we have linear spaces $L$ and $\tilde{L}$ of dimension $n$ and $m$ respectively. Let there also be a linear space $T$ whose dimension is equal to the product $nm$. The spaces $L$, $\tilde{L}$, $T$ are assumed to be all real or all complex.

Further, let there be given a certain map $f$ of the two spaces $L$, $\tilde{L}$ into the space $T$. This means that there is an element $t \in T$ associated with an arbitrary pair of elements $a$, $\tilde{a}$, where $a \in L$, $\tilde{a} \in \tilde{L}$. For the sake of simplicity we write

$$t = a\tilde{a} \qquad (1)$$

thus identifying the pair $a\tilde{a}$ with its image $t = f(a, \tilde{a})$ in the space $T$. Let us agree from now on to write the element of $L$ first. If $\tilde{L}$ coincides with $L$, then the pair in the right member of (1) will be taken to be ordered. Thus, in general, $a\tilde{a} \neq \tilde{a}a$, that is, the elements which in $T$ correspond to the pairs $a\tilde{a}$ and $\tilde{a}a$ by virtue of the map $f$ do not necessarily coincide.

We assume that the following properties of the map $f$ hold true.




(1) Distributive property (with respect to each element of a pair):

$$(a + b)\,\tilde{a} = a\tilde{a} + b\tilde{a} \qquad (2)$$

$$a\,(\tilde{a} + \tilde{b}) = a\tilde{a} + a\tilde{b} \qquad (3)$$

for any $a, b \in L$, $\tilde{a}, \tilde{b} \in \tilde{L}$.

(2) Associative property:

$$(\alpha a)\,\tilde{a} = a\,(\alpha\tilde{a}) = \alpha\,(a\tilde{a}) \qquad (4)$$

for any $a \in L$, $\tilde{a} \in \tilde{L}$ and for any scalar $\alpha$ (real or complex, depending on whether the spaces $L$, $\tilde{L}$, $T$ are real or complex).

By virtue of the properties (2), (3), (4), there are sufficient grounds to use the word "product" in place of "pair". Accordingly, equation (1) is to be read thus: element $t$ of space $T$ is the product of element $a$ of $L$ by element $\tilde{a}$ of $\tilde{L}$.

Remark. Now, equations (2), (3) and (4), unlike the similar equations of Section 2, express the properties of the map $f$ and not the conditions of admissible replacements in $T$. The point is that the admissible replacements have already been given in $T$ beforehand together with the definition of $T$ as a linear space.

(3) The property of nonsingularity of the map $f$: if the elements $a_1, \dots, a_n$ are linearly independent in $L$ and the elements $\tilde{a}_1, \dots, \tilde{a}_m$ are linearly independent in $\tilde{L}$, then the system $a_i\tilde{a}_k$ ($i = 1, 2, \dots, n$; $k = 1, 2, \dots, m$) is linearly independent in $T$.

By the foregoing, some of the elements of $T$ are products of elements of the spaces $L$ and $\tilde{L}$. For example, due to (4) the zero element in $T$ is the product of the zero element of $L$ by any element of $\tilde{L}$, or of any element of $L$ by the zero element of $\tilde{L}$. However, not every element of $T$ is a product of some element of $L$ by some element of $\tilde{L}$. Nevertheless, it is easy to prove the following assertion: every element $t \in T$ is a linear combination of products of elements taken from $L$ and $\tilde{L}$.

Proof. Let $e_1, \dots, e_n$ be a basis in $L$ and $\tilde{e}_1, \dots, \tilde{e}_m$ a basis in $\tilde{L}$. Then, due to condition (3), the system of all product pairs $e_i\tilde{e}_k$ is a basis in $T$ (since the dimension of $T$ is equal to $nm$). Hence, any element $t \in T$ can be represented as

$$t = \sum t^{ik}\, e_i \tilde{e}_k \qquad (5)$$

where $i = 1, 2, \dots, n$; $k = 1, 2, \dots, m$. The proof is complete.

Remark. If we put $\sum_i t^{ik} e_i = a_k$ and $\tilde{e}_k = \tilde{a}_k$, then (5) takes the form

$$t = \sum a_k \tilde{a}_k \qquad (6)$$

Thus every element $t \in T$ can be represented as a sum of products of elements taken from $L$ and $\tilde{L}$.



4. Definition. A linear space $T$ of dimension $nm$ regarded together with a given map $f$ into $T$ of a pair of linear spaces $L$, $\tilde{L}$ of dimensions $n$ and $m$, respectively, is called a tensor product of $L$ by $\tilde{L}$ if $f$ satisfies the conditions (1), (2), (3) of Subsection 3. Symbolically, $T = L \otimes \tilde{L}$.

The elements of the space $T$ regarded as linear combinations of the products of elements taken from $L$ and $\tilde{L}$, that is, as (5) or (6), are termed tensors over $L$ and $\tilde{L}$. The numbers $t^{ik}$ in (5) are called the components of the tensor $t$ relative to the basis $e_i\tilde{e}_k$ (or relative to the bases $e_i$ and $\tilde{e}_k$ of the spaces $L$ and $\tilde{L}$).

5. We now show how it is possible to construct a map $f$ with the properties (1), (2), (3) of Subsection 3. At the same time we will determine the degree of arbitrariness in this construction.

Let us first assume that the map $f$ is already given. Let $e_i$ and $\tilde{e}_k$ be arbitrary bases in $L$ and $\tilde{L}$. Then by property (3) the product pairs $e_i\tilde{e}_k$ defined by the map $f$ constitute a basis in $T$. Suppose we know $e_i \in L$, $\tilde{e}_k \in \tilde{L}$ and $e_i\tilde{e}_k \in T$. Then we know the map $f$ completely, that is, for any $x \in L$ and $\tilde{x} \in \tilde{L}$ we know the product $x\tilde{x}$ in the space $T$. Indeed,

$$x = \sum x^i e_i, \qquad \tilde{x} = \sum \tilde{x}^k \tilde{e}_k \qquad (7)$$

whence, due to properties (1) and (2),

$$x\tilde{x} = \sum x^i e_i \sum \tilde{x}^k \tilde{e}_k = \sum x^i \tilde{x}^k\, e_i \tilde{e}_k \qquad (8)$$

Thus, if the map $f$ exists, then it is uniquely defined by specification of arbitrary bases $e_i$ and $\tilde{e}_k$ in $L$ and $\tilde{L}$ and by specification of the basis $e_{ik}$ in $T$ whose elements are the products of $e_i$ by $\tilde{e}_k$, that is, $e_{ik} = e_i\tilde{e}_k$ (these equations are to be understood in the sense that $e_{ik}$ is the image of the pair $e_i$, $\tilde{e}_k$ precisely under the map $f$).

But it is easy to see that the desired map $f$ will be found from these very same conditions. Indeed, let $e_i \in L$, $\tilde{e}_k \in \tilde{L}$ and $e_{ik} \in T$ ($i = 1, \dots, n$; $k = 1, \dots, m$). We regard each of these systems to be linearly independent in its space. We take $e_{ik}$ for the images of the pairs $e_i$, $\tilde{e}_k$ with respect to the desired map $f$, i.e., we put $e_i\tilde{e}_k = e_{ik}$. These equations can be ensured since the number of all $e_i$ is equal to $n$ and the number of all $\tilde{e}_k$ is equal to $m$, while the number of all $e_{ik}$ is equal to $nm$. Then we determine $f$ from (8) on an arbitrary pair $x$, $\tilde{x}$, where $x$ and $\tilde{x}$ are given by (7).

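The properties (1) and (2), Subsection 3, are readily verified for the map $f$ thus constructed. Let us verify, say, the identity (2). Let $y = \sum y^i e_i$. Then

$$(x + y)\,\tilde{x} = \sum (x^i + y^i)\, \tilde{x}^k\, e_i \tilde{e}_k = \sum x^i \tilde{x}^k\, e_i \tilde{e}_k + \sum y^i \tilde{x}^k\, e_i \tilde{e}_k = x\tilde{x} + y\tilde{x}$$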




It is a bit more difficult to verify the third property (the nonsingularity of the constructed map $f$). Let us take any new pair of bases $a_{i'}$ and $\tilde{a}_{k'}$ in $L$ and $\tilde{L}$. We have to demonstrate that the product pairs $a_{i'}\tilde{a}_{k'}$ (that is, the images of the pairs $a_{i'}$, $\tilde{a}_{k'}$ in $T$) are linearly independent. We have

$$a_{i'} = \sum P^i_{i'}\, e_i, \qquad \tilde{a}_{k'} = \sum \tilde{P}^k_{k'}\, \tilde{e}_k$$

whence

$$a_{i'}\tilde{a}_{k'} = \sum P^i_{i'} \tilde{P}^k_{k'}\, e_i \tilde{e}_k \qquad (9)$$

We see that the vectors $a_{i'}\tilde{a}_{k'}$ are linearly expressible in terms of $e_i\tilde{e}_k$. Hence the rank of the system $a_{i'}\tilde{a}_{k'}$ in the space $T$ does not exceed the rank of the system $e_i\tilde{e}_k$.

But from (9) we get

$$e_i\tilde{e}_k = \sum Q^{i'}_i \tilde{Q}^{k'}_k\, a_{i'} \tilde{a}_{k'} \qquad (10)$$

Here the quantities $Q^{i'}_i$ are defined in standard fashion from $P^i_{i'}$ (see Section 1); $\tilde{Q}^{k'}_k$ are defined from $\tilde{P}^k_{k'}$ in similar fashion. From (10) we conclude that the rank of the system $e_i\tilde{e}_k$ does not exceed the rank of the system $a_{i'}\tilde{a}_{k'}$. Hence the ranks of these systems are equal. And since, by hypothesis, the system $e_i\tilde{e}_k$ is independent in $T$, so also is the system $a_{i'}\tilde{a}_{k'}$ independent (because it has the same rank, which is equal to the total number of vectors).

To  summarize,  then,  we  have  proved  the  existence  of  the  maps 
we  need  and  have  completely  elucidated  the  arbitrariness  of  their 
construction. 
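One concrete realization of such a map can be sketched as follows. It is an assumption of this illustration, not the text's construction, that $L$ and its partner space are coordinate spaces of dimensions 2 and 3, that $T$ is the coordinate space of dimension 6, and that the image of a pair is the list of all products of coordinates; the names `f` and `rank` and the sample bases are likewise hypothetical.

```python
from fractions import Fraction


def f(a, a_t):
    """Image of the pair (a, a~): all products of a coordinate of a by one of a~."""
    return [ai * ak for ai in a for ak in a_t]


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Arbitrary (non-standard) bases in the two factor spaces.
basis_L = [[1, 1], [1, 2]]
basis_Lt = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]

# Nonsingularity: the nm = 6 products of basis vectors are independent in T.
products = [f(a, a_t) for a in basis_L for a_t in basis_Lt]
print(rank(products))

# Distributivity and associativity on sample elements.
a, b, c = [1, 1], [1, 2], [1, 1, 0]
print(f([u + v for u, v in zip(a, b)], c) == [u + v for u, v in zip(f(a, c), f(b, c))])
print(f([3 * u for u in a], c) == f(a, [3 * u for u in c]) == [3 * u for u in f(a, c)])
```

Composing `f` with any invertible linear map of the 6-dimensional space onto itself would give another map with the same three properties, which illustrates the arbitrariness discussed above.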

6. Returning to the definition in Subsection 4, we conclude that we have defined $L \otimes \tilde{L}$ with exactly the same arbitrariness as there is in the choice of the map $f$.

7. We denote by $\varphi$ an arbitrary one-to-one mapping $t' = \varphi(t)$ of the space $T$ onto itself, this mapping being a linear isomorphism of the space (see Section 10, Chapter I). We denote by $f'$ a composition of the maps $f$ and $\varphi$; symbolically, $f' = \varphi f$. This equation is to be understood as follows: first $f$ carries an arbitrary pair $a$, $\tilde{a}$ ($a \in L$, $\tilde{a} \in \tilde{L}$) into an element $t$ of the space $T$ and then $\varphi$ carries $t$ into $t' = \varphi(t)$.

Definition. The tensor products of $L$ by $\tilde{L}$ that have been established by means of the maps $f$ and $f'$ will be called isomorphic if $f' = \varphi f$, where $\varphi$ is some linear isomorphism of $T$ onto itself. The tensors $t$ and $t'$ will be said to be corresponding under the given isomorphism of tensor products if $t' = \varphi(t)$.



In accordance with this definition, all the tensors constructed by means of $f$ are mapped into tensors constructed with the aid of $f' = \varphi f$. In order to explain why a consideration of tensors constructed by $f$ is equivalent to a consideration of their images under an isomorphism, let us examine the situation from the arithmetical point of view; that is, we will examine the tensors in terms of components.

Let $(e_i\tilde{e}_k)' = \varphi(e_i\tilde{e}_k)$, $t = \sum t^{ik}\, e_i\tilde{e}_k$. Then

$$t' = \sum t^{ik}\, (e_i\tilde{e}_k)' \qquad (11)$$

where $t^{ik}$ are the very same numbers as in (5). Thus, under an isomorphism, the corresponding tensors have the same components relative to the appropriate bases. The only thing that changes is the representation of tensors in the form of certain elements of the space $T$. But the tensor components, and, hence, all equations referring to them in any kind of problem, remain unaltered.

8. Finally, we offer the following proposition, the proof of which we leave to the reader: if $f$ and $f'$ are two maps of a pair of spaces $L$ and $\tilde{L}$ into the space $T$ satisfying the conditions (1), (2), (3) of Subsection 3, then there is an isomorphism $\varphi$ of space $T$ onto itself such that $f' = \varphi f$.

From this follows the

Theorem. All tensor products of a given linear space $L$ by a given linear space $\tilde{L}$ are isomorphic to one another.


Chapter  VI 


GROUPS  AND  SOME  APPLICATIONS 


§  1.  Groups  and  subgroups.  Distribution  of  bases  into  classes  with 
respect  to  a  given  subgroup  of  matrices.  Orientation 

1. Given a set $G$ for the elements of which an equality (or admissible replacement, see Section 1, Chapter I) has been established; also given is an operation called multiplication. This operation associates to every pair of elements $a$, $b$ in $G$, taken in a specific order, an element $c$ of that set. Symbolically we write $c = ab$ and say that $c$ is the product of $a$ by $b$. It is assumed that the product $ab$ is invariant to admissible replacements of the factors $a$ and $b$.

Definition.  A  set  G  together  with  the  operation  of  multiplication 
specified  in  it  is  said  to  be  a  group  (relative  to  this  operation)  if 
the  following  axioms  hold  true. 

(1) For any $a, b, c \in G$

$$(ab)\,c = a\,(bc)$$

(2) There exists an element $e \in G$ such that for any $a \in G$ we have

$$ae = a$$

The element $e$ is called the unit of the group.

(3) For any $a \in G$ there exists an $x \in G$ such that

$$ax = e$$

This element is called the inverse of $a$ and is denoted by $a^{-1}$.

2. The following propositions follow readily from Axioms (1), (2), (3).

(a) If $ax = e$, then $xa = e$.

Proof. By Axiom 3 there exists a $y \in G$ such that $xy = e$. On the other hand, if $ax = e$, then $a = ae = a(xy) = (ax)y = ey$, whence $a = ey$. Hence, $xa = x(ey) = (xe)y = xy = e$.




(b) $ea = a$ for any $a \in G$.

Proof. By Axiom 3 and from what has been proved there exists an $x$ such that $ax = e$ and $xa = e$. Thus,

$$ea = (ax)\,a = a\,(xa) = ae = a$$

(c) If $ax = e$, $ay = e$, then $y = x$.

Proof. We have $y = ye = y(ax) = (ya)x = ex = x$.

The foregoing theorems signify that there is no necessity, in a group, to distinguish between left and right inverses and also between left and right units. Besides, in a group there is always uniquely defined an operation inverse to group multiplication; namely, the equation $ax = b$ has the unique solution $x = a^{-1}b$ and the equation $xa = b$ has the unique solution $x = ba^{-1}$. This means, finally, that every group has only one unit. Indeed, if $ae = a$ and $ae^* = a$, then, as has been proved, $e^* = e$.
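The axioms and their consequences can be checked by brute force on a small finite group. The example below is an assumption of this illustration and does not appear in the text: the six permutations of three symbols, multiplied by composition.

```python
from itertools import permutations

# The group of all permutations of (0, 1, 2).
G = list(permutations(range(3)))


def mul(a, b):
    """Product ab: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))


e = (0, 1, 2)  # the identity permutation

# Axioms (1)-(3): associativity, right unit, right inverse.
assoc = all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in G for b in G for c in G)
right_unit = all(mul(a, e) == a for a in G)
right_inverse = {a: next(x for x in G if mul(a, x) == e) for a in G}

# Propositions (a)-(c): the right inverse is also a left inverse,
# the right unit is also a left unit, and the inverse is unique.
left_inverse_too = all(mul(right_inverse[a], a) == e for a in G)
left_unit_too = all(mul(e, a) == a for a in G)
unique_inverse = all(sum(1 for x in G if mul(a, x) == e) == 1 for a in G)

# The equation ax = b is solved by x = a^{-1} b.
solves = all(mul(a, mul(right_inverse[a], b)) == b for a in G for b in G)

# Group multiplication need not be commutative.
noncommutative = any(mul(a, b) != mul(b, a) for a in G for b in G)

print(assoc, right_unit, left_inverse_too, left_unit_too, unique_inverse, solves, noncommutative)
```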

3. An important instance of a group is the set of all nonsingular $n \times n$ matrices (either real or complex) together with the multiplication operation defined in Section 2 of Chapter II. In the group of nonsingular $n \times n$ matrices the unit is the unit matrix $E$. The inverse of a nonsingular matrix is constructed in accordance with Subsections 4 to 8 of Section 3, Chapter II. We leave it to the reader to verify the first axiom of the group for multiplication of matrices (that is, associativity: $(AB)C = A(BC)$).

The matrix example shows that group multiplication is, in general, not commutative (see Subsection 3, Section 2, Chapter II).

4. A group is said to be commutative or Abelian if for any elements $a$, $b$ in the group we have $ab = ba$. Incidentally, in this case the group operation is frequently called addition and in place of $ab$ we write $a + b$. Then the unit of the Abelian group is called the zero element.

Examples. (1) Every linear space is an Abelian group under the operation of addition of elements. This is clear since the first four axioms of a linear space coincide precisely with the three axioms of a group with the supplementary condition of commutativity.

(2) The set of all real numbers different from zero forms a commutative group under the operation of ordinary multiplication. The unit of this group is the number one, the inverse of $\lambda$ is the number $\lambda^{-1}$.

5. Definition. A subset $\tilde{G}$ of elements of a group $G$ is called a subgroup if $a \in \tilde{G}$, $b \in \tilde{G}$ implies $ab \in \tilde{G}$, and $a \in \tilde{G}$ implies $a^{-1} \in \tilde{G}$.

From this it follows, in particular, that $e \in \tilde{G}$. Thus, under these conditions the requirements of Axioms 1 to 3, Subsection 1, hold for $\tilde{G}$, and the subset $\tilde{G}$ itself is a group under the same operation of multiplication as is given in the whole group $G$. The unit of the group $G$ is the unit of every subgroup.

6.  Examples  of  subgroups.  (1)  In  an  arbitrary  group  G  the  unit  e 
constitutes  a  subgroup  consisting  of  a  single  element. 

(2)  The  entire  group  G  may  be  viewed  as  a  subgroup  of  G. 

(3)  If  a  linear  space  L  is  viewed  as  a  group  under  addition, 
then  any  subspace  of  L  is  a  subgroup.  We  suggest  that  the  reader 
construct  a  subgroup  in  L  that  is  not  a  subspace. 

(4)  In  the  group  of  real  numbers  different  from  zero  (see 
Example  2  of  Subsection  4)  all  positive  numbers  form  a  subgroup. 

(5) In the same group there is another subgroup consisting of two elements: the numbers $\lambda = 1$ and $\lambda = -1$.

(6) In the group of all real nonsingular $n \times n$ matrices let us consider the subset $G$ consisting of matrices with a positive determinant. From the theorem on the determinant of a product of matrices (Chapter II, Section 3) it follows that $G$ is a subgroup.

Indeed, if $A, B \in G$, then $\det AB = \det A \cdot \det B > 0$, hence $AB \in G$. If $A \in G$, then $\det A^{-1} = (\det A)^{-1} > 0$ and, hence, $A^{-1} \in G$.

(7) In the group of all real (or complex) nonsingular $n \times n$ matrices we consider the subset $G$ consisting of all matrices whose determinants have absolute value 1. It is easy to see that $G$ is a subgroup. Indeed, if $A, B \in G$, then $|\det AB| = |\det A| \cdot |\det B| = 1$, hence $AB \in G$; if $A \in G$, then $|\det A^{-1}| = |\det A|^{-1} = 1$, hence $A^{-1} \in G$.
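The determinant arguments of examples (6) and (7) can be tried on concrete second-order matrices. The helper names `det2`, `mul2`, `inv2` and the sample matrices are assumptions of this sketch.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def mul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(m):
    """Inverse of a nonsingular 2 x 2 matrix, with exact fractions."""
    d = Fraction(det2(m))
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


# Example (6): positive determinants are preserved by products and inverses.
A = [[2, 1], [0, 1]]   # det A = 2
B = [[1, 2], [1, 5]]   # det B = 3
print(det2(mul2(A, B)), det2(inv2(A)))

# Example (7): determinants of absolute value 1.
R = [[0, 1], [1, 0]]   # det R = -1
U = [[1, 1], [0, 1]]   # det U = 1
print(abs(det2(mul2(R, U))), abs(det2(inv2(R))))
```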

7. Let a subgroup $G$ be taken in the group of all nonsingular $n \times n$ matrices. We consider the $n$-dimensional linear space $L$ (real if $G$ consists of real matrices, and complex if the matrices are complex). In $L$ take a basis $e_1, \dots, e_n$ and pass to a different basis

$$e_{i'} = \sum P^i_{i'}\, e_i \qquad (1)$$

provided that the coefficients $P^i_{i'}$ constitute a matrix $P$ of the subgroup $G$.

We condense (1) to

$$e' = Pe \qquad (1a)$$

regarding (1a) as a matrix equation in which the elements of the column matrices $e$ and $e'$ are vectors and the elements of the square matrix $P$ are scalars:

$$e = \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}, \qquad e' = \begin{pmatrix} e_{1'} \\ \vdots \\ e_{n'} \end{pmatrix}, \qquad P = \begin{pmatrix} P^1_{1'} & \dots & P^n_{1'} \\ \vdots & & \vdots \\ P^1_{n'} & \dots & P^n_{n'} \end{pmatrix}$$

By taking for $P$ all possible matrices in $G$, we get a diversity of bases $e'$ that form a certain class of bases, which class we denote by $\mathcal{S}(e)$. We will say that the class $\mathcal{S}(e)$ is generated by the basis $e$ with respect to the given subgroup $G$.

8. From the fact that $G$ is a subgroup there follows an important theorem.

Theorem. If some basis $e'$ belongs to $\mathcal{S}(e)$, then the class generated by the basis $e'$ coincides with $\mathcal{S}(e)$. Symbolically: $\mathcal{S}(e') = \mathcal{S}(e)$.

Proof. Let $e''$ be an arbitrary basis. Suppose that $e'' \in \mathcal{S}(e')$. This means that there exists a matrix $P' \in G$ for which $e'' = P'e'$. On the other hand, $e' \in \mathcal{S}(e)$. Accordingly, we have a matrix $P \in G$ for which $e' = Pe$. Whence $e'' = (P'P)e$. But since $G$ is a subgroup and since $P \in G$, $P' \in G$, it follows that $P'P \in G$. Hence $e'' \in \mathcal{S}(e)$. Thus, every basis in $\mathcal{S}(e')$ enters into $\mathcal{S}(e)$, that is, the class $\mathcal{S}(e')$ is contained in the class $\mathcal{S}(e)$.

Now note that in the case $e' = Pe$, $P \in G$, it is true that $e = P^{-1}e'$ with $P^{-1} \in G$ (since $G$ is a subgroup). In other words, if $e' \in \mathcal{S}(e)$, then $e \in \mathcal{S}(e')$. Hence, in the preceding argument we can interchange $\mathcal{S}(e)$ and $\mathcal{S}(e')$. Therefore, the class $\mathcal{S}(e)$ is included in the class $\mathcal{S}(e')$; hence $\mathcal{S}(e') = \mathcal{S}(e)$ and the theorem is proved.

Remark. Since the choice of basis that generates a class is immaterial within the framework of that class, we will henceforth write $\mathcal{S}$ in place of $\mathcal{S}(e)$.

9. From what was proved in Subsection 8 it follows that the set of all bases in $L$ breaks up into classes with respect to a given subgroup $G$ so that each basis is accommodated by exactly one class (two classes either do not have common bases or are completely coincident). Each class $\mathcal{S}$ is invariant with respect to a given subgroup $G$. This means that for any basis $e \in \mathcal{S}$ and for any matrix $P \in G$, $e' = Pe \in \mathcal{S}$. (In other words, after a transformation by means of any matrix of the subgroup $G$, every basis of the class $\mathcal{S}$ remains in that class.)

10. Example. Let L be a Euclidean plane (more precisely, a linear
space of vectors lying in that plane) and G the subgroup of
second-order real matrices whose determinants are equal in absolute
value to 1. Then each class consists of bases with one and the same
area of the basis parallelogram (Fig. 28) (different classes are
associated with different values of this area). Indeed, let

    e_{1'} = αe_1 + βe_2,    e_{2'} = γe_1 + δe_2    (2)

Denote by S the area of the basis parallelogram for e_1, e_2, by S'
a similar area for e_{1'}, e_{2'}. From (2) we have

    S' = S |αδ − βγ|

If the matrix P of the transformation (2) belongs to G, then
|αδ − βγ| = 1 and S' = S. Conversely, if S' = S, then P ∈ G.
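
The relation S' = S |αδ − βγ| is easy to check numerically. The following is a minimal Python sketch, assuming (for illustration only) that the basis vectors are given by their coordinates in some fixed orthonormal frame; the function names `det2`, `transform_basis` and `area` are hypothetical and not taken from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def transform_basis(P, basis):
    """Apply (2): each row of P expresses a new basis vector via the old ones."""
    (alpha, beta), (gamma, delta) = P
    e1, e2 = basis
    new_e1 = (alpha * e1[0] + beta * e2[0], alpha * e1[1] + beta * e2[1])
    new_e2 = (gamma * e1[0] + delta * e2[0], gamma * e1[1] + delta * e2[1])
    return (new_e1, new_e2)

def area(basis):
    """Area of the basis parallelogram (coordinates in an orthonormal frame)."""
    return abs(det2(basis))

e = ((2, 0), (1, 3))            # basis parallelogram of area 6
P_in_G = ((2, 1), (1, 1))       # |det P| = 1, so P belongs to G
P_not_in_G = ((2, 0), (0, 3))   # |det P| = 6, so P does not belong to G

area_same_class = area(transform_basis(P_in_G, e))
area_other_class = area(transform_basis(P_not_in_G, e))
```

Under these assumptions the first transformed basis stays in the class of the original one (equal areas), while the second lands in a different class, with the area multiplied by |det P|.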

11. Let L again denote an n-dimensional linear space. We assume
it to be real. Now denote by G the subgroup consisting of
all n × n matrices with positive determinants.


Fig.  28 


In L take an arbitrary basis e and construct a class S(e) with
respect to the subgroup G. Henceforth we denote this class by S.
It is obvious that the class S does not exhaust all the bases of the
space L.

Indeed, if P is an n × n matrix with negative determinant, then
the basis e' = Pe does not enter into S. Let us take such a basis e'
and construct, relative to the subgroup G, a class S(e'), which
we will denote by S'.

We will now demonstrate that in this case there are no classes
other than S and S'. For any basis e'' of L there will be
nonsingular matrices P' and P'' such that e'' = P''e and e'' = P'e'.
From the latter equation and from the relation e' = Pe we have
e'' = (P'P)e. Hence P'' = P'P, whence det P'' = det P' · det P.
Since det P < 0, the determinants of the matrices P' and P'' have
different signs, which means that one of them is positive. If
det P'' > 0, then e'' ∈ S; if det P' > 0, then e'' ∈ S', which is
what we wanted to establish.



12. Thus, all the bases of the space L are divided, with respect
to the subgroup G (P ∈ G if det P > 0), into two classes.

13. If two bases of L belong to one of these two classes, they are
said to have the same orientation. Two bases are said to have
opposite orientations if they belong to different classes.

The bases of one of these classes are said to be positively
oriented (or right-handed); then the bases of the other class are
said to be negatively oriented (or left-handed). Either one of the
two classes can be chosen as the class of positively oriented bases.
If that choice has been made, then we say that an orientation has
been specified in the space L.

14.  Quite  naturally,  it  is  not  necessary  to  deal  first  with  groups 
in  order  to  express  the  concept  of  orientation  of  a  space.  This  con¬ 
cept  can  be  explained  without  involving  groups  in  any  way,  which 
is  what  we  will  now  do. 

Let e_1, ..., e_n and e_{1'}, ..., e_{n'} be two arbitrary bases of a
space L. We have

    e_{i'} = P^i_{i'} e_i    (3)

where the coefficients P^i_{i'} constitute a nonsingular matrix P,
that is, det P ≠ 0.

If det P > 0, then the basis e_{i'} is said to have the same
orientation as the basis e_i; if det P < 0, then the basis e_{i'} is
said to have an orientation opposite to that of the basis e_i.

15.  We  have  the  following  propositions. 

(1) If the basis e_{i'} has the same orientation as the basis e_i,
then e_i has the same orientation as e_{i'}. Indeed, by (3), the
vectors e_{i'} can be expressed in terms of e_i by means of the
matrix P. Conversely, the vectors e_i can be expressed in terms of
e_{i'} with the aid of the matrix P⁻¹. Thus det P⁻¹ = (det P)⁻¹ > 0.

(2) If two bases have the same orientation as a third, then all
three have the same orientation. Let the vectors e_{i'} be expressed
in terms of e_i with the aid of matrix P, let the vectors e_{i''} be
expressed in terms of e_i with the aid of matrix P', and let
det P > 0 and det P' > 0. Then the vectors e_{i''} are expressed in
terms of e_{i'} with the aid of matrix P'P⁻¹ and we get

    det(P'P⁻¹) = det P' · det P⁻¹ > 0

(3) If two bases are oppositely oriented relative to a third, then
these two bases have the same orientation. If det P < 0 and
det P' < 0, then det(P'P⁻¹) > 0.
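
The three propositions reduce to sign rules for determinants, which can be illustrated with exact arithmetic. The following is a small Python sketch for 2 × 2 matrices; the helper names and the sample matrices are assumptions chosen for the illustration, not part of the text.

```python
from fractions import Fraction

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv2(m):
    """Exact inverse of a nonsingular 2x2 integer matrix."""
    (a, b), (c, d) = m
    k = Fraction(1, det2(m))
    return ((d * k, -b * k), (-c * k, a * k))

Q = ((2, 1), (1, 1))          # det = 1 > 0
R = ((3, 0), (0, 2))          # det = 6 > 0
P = ((1, 2), (3, 4))          # det = -2 < 0
P_prime = ((0, 1), (1, 0))    # det = -1 < 0

# Proposition (1): det P > 0 implies det P^(-1) > 0.
prop1 = det2(Q) > 0 and det2(inv2(Q)) > 0
# Proposition (2): det P > 0 and det P' > 0 imply det(P'P^(-1)) > 0.
prop2 = det2(matmul2(R, inv2(Q))) > 0
# Proposition (3): det P < 0 and det P' < 0 imply det(P'P^(-1)) > 0.
prop3 = det2(matmul2(P_prime, inv2(P))) > 0
```

`Fraction` is used so that the inverse is computed without rounding; the sign checks are then exact.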



16. In the space L choose an arbitrary basis e_i and call it
positively oriented (right-handed). We say that any other basis is
positively oriented (or right-handed) if it has the same orientation
as e_i. We say that a basis is negatively oriented (or left-handed)
if its orientation is opposite to that of the basis e_i. Thus all
bases of L will be placed in two classes. By the three propositions
proved in Subsection 15, any two bases of one class have the same
orientation; any two bases taken from different classes have
opposite orientations. The indicated classes have the same status,
that is, either one of them can be chosen to represent the class of
positively oriented bases. If that choice has been made (by
indicating the basis e_i), then we say that an orientation has been
specified in the space.

17. Note in conclusion that the concept of orientation is
essentially connected with the fact that a basis is viewed as an
ordered collection of vectors. If the numbering of the vectors of a
basis is altered so that two vectors are interchanged while the
remaining ones retain their number labels, then the orientation of
the basis is reversed. Indeed, let the bases e_i and e_{i'} be
connected by the relation (3) and let the appropriate change be made
in the numbering of the vectors e_{i'}. Then in matrix P there will
be an interchange of two rows and, hence, the determinant of the
matrix will change sign.
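
The sign change under an interchange of two rows can be seen on a concrete matrix. Below is a brief Python sketch, assuming a 3 × 3 integer matrix chosen purely for illustration; `det` is a hypothetical helper that expands along the first row.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def swap_rows(m, i, k):
    """Return a copy of m with rows i and k interchanged."""
    rows = [list(row) for row in m]
    rows[i], rows[k] = rows[k], rows[i]
    return rows

P = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]

d_original = det(P)
d_swapped = det(swap_rows(P, 0, 2))   # renumbering two vectors of the new basis
```

Here `d_swapped` equals `-d_original`, so the renumbered basis falls into the other orientation class.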

§  2.  Transformation  groups.  Isomorphism  and  homomorphism 
of  groups 

1.  Suppose  we  have  a  set  M  of  elements  that  we  agree  to  call 
points. 

We say that a transformation of the set M is given if to every
point x in M is associated a certain point y of the same set M.
Symbolically,

    y = f(x)

Here, y is the image of the point x and x is the inverse image
of y.

Two transformations f and g are said to be equal if g(x) = f(x)
for any point x ∈ M.

A transformation f is said to be one-to-one if every point y ∈ M
is an image of some unique point x ∈ M. In this case, the
transformation which to an arbitrary point y = f(x) associates its
inverse image x is said to be the inverse of the original
transformation f and is denoted by f⁻¹:

    y = f(x),    x = f⁻¹(y)



The transformation e is called the identity transformation if

    e(x) = x

for any point x in M. It is clear that the identity transformation
is one-to-one and e⁻¹ = e.

In the particular case where M is the number axis (line)
−∞ < τ < +∞ the concept of a transformation coincides with
the concept of a function specified on the entire number axis. If
the function t = f(τ) has a single-valued inverse g(t), also
specified on the entire number line −∞ < t < +∞, then this inverse
function specifies an inverse transformation (symbolically: f⁻¹ = g).

2. Let φ, f be transformations of the set M.

The product of φ by f is said to be the transformation χ given by
the formula

    χ(x) = φ[f(x)]

for any point x in M. Symbolically we write χ = φf.

When M is the number line, the product of the transformation
t = φ(τ) by the transformation t = f(τ) is the composite function
t = φ[f(τ)]. In general, a product of transformations is not
commutative (for example, sinh³ x ≠ sinh x³).
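
The product of transformations is ordinary function composition, and its dependence on order is easy to observe numerically. The following Python sketch uses a hypothetical helper `compose`; the sample point x = 2 is an arbitrary choice.

```python
import math

def compose(phi, f):
    """Product phi f of two transformations: x -> phi(f(x))."""
    return lambda x: phi(f(x))

def identity(x):
    """The identity transformation e."""
    return x

def cube(x):
    return x ** 3

sinh_cubed = compose(cube, math.sinh)     # x -> (sinh x)^3
sinh_of_cube = compose(math.sinh, cube)   # x -> sinh(x^3)

x = 2.0
left = sinh_cubed(x)
right = sinh_of_cube(x)
```

At x = 2 the two products give clearly different values, while composing with `identity` on either side leaves a transformation unchanged.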

For any transformation f of an arbitrary set M we have the
obvious identities

    fe = ef = f    (1)


and if f is one-to-one, then

    f⁻¹f = e,    ff⁻¹ = e    (2)

Confining ourselves to one-to-one transformations, we note that

    if fg = e or gf = e, then g = f⁻¹    (3)

A product of transformations is associative, that is,

    (ψφ)f = ψ(φf)    (4)

for any three transformations f, φ, ψ of the set M. This is clear
since each of the transformations (4) operates in accord with the
formula y = ψ{φ[f(x)]}.

If the transformations φ, f are one-to-one, then both products φf
and fφ are also one-to-one, and

    (φf)⁻¹ = f⁻¹φ⁻¹    (5)

The one-to-oneness of each of the transformations φf and fφ
follows directly from the one-to-one character of φ and f; formula
(5) follows from (1)-(4) since

    (f⁻¹φ⁻¹)(φf) = f⁻¹(φ⁻¹φ)f = f⁻¹ef = f⁻¹f = e
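
Formula (5) can be tried out on one-to-one transformations of a finite set. The sketch below, in Python, represents a transformation of the set {0, 1, 2} as a tuple whose entry at position x is the image of x; this representation and the helper names are assumptions made for the illustration.

```python
def compose(phi, f):
    """Product phi f: x -> phi(f(x)), with transformations stored as tuples."""
    return tuple(phi[f[x]] for x in range(len(f)))

def inverse(f):
    """Inverse of a one-to-one transformation stored as a tuple."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)

e = (0, 1, 2)      # identity transformation
phi = (1, 2, 0)    # cyclic shift
f = (1, 0, 2)      # interchange of the points 0 and 1

lhs = inverse(compose(phi, f))                  # (phi f)^(-1)
rhs = compose(inverse(f), inverse(phi))         # f^(-1) phi^(-1)
wrong_order = compose(inverse(phi), inverse(f)) # phi^(-1) f^(-1)
```

With these particular choices `lhs` and `rhs` coincide, while `wrong_order` differs, which shows that the order of the factors in (5) matters.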

3. From the definitions and properties stated in Subsections 1, 2
it follows that all possible one-to-one transformations of a given
set M constitute a group with respect to multiplication
(composition) of transformations.

4.  Definition.  Any  collection  G  of  transformations  of  a  set  M 
is  called  a  group  of  transformations  of  that  set  if  G  forms  a  group 
under  multiplication  of  transformations. 

The third axiom implies that only one-to-one transformations
can constitute a group of transformations. We can therefore say
that every group of transformations of M is a subgroup of the
group of all one-to-one transformations of the set.

5.  Throughout  this  chapter  we  consider  only  one-to-one  trans¬ 
formations  and  frequently  do  not  stipulate  this  condition. 

6. Let G be a collection of transformations of a set M. Since the
associative law (4) holds for any three transformations, G will be
a group if:

(a) from the fact that two transformations f, φ belong to G it
follows that fφ ∈ G and φf ∈ G;

(b) from the fact that a transformation f belongs to G follows
the existence and membership in G of the inverse transformation
f⁻¹.

From this now follows the membership in G of the identity
transformation e; it is the unit of the group G (in this connection
see formulas (1) and (2)).
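
Conditions (a) and (b) can be checked mechanically for a finite collection of transformations. The following Python sketch again stores a transformation of {0, 1, 2} as a tuple of images; the function `is_transformation_group` is a hypothetical name, and the test assumes a nonempty collection of one-to-one transformations.

```python
def compose(phi, f):
    """Product phi f: x -> phi(f(x))."""
    return tuple(phi[f[x]] for x in range(len(f)))

def inverse(f):
    """Inverse of a one-to-one transformation stored as a tuple."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)

def is_transformation_group(G):
    """Check (a) closure under products and (b) closure under inverses."""
    closed_under_product = all(compose(f, g) in G for f in G for g in G)
    closed_under_inverse = all(inverse(f) in G for f in G)
    return closed_under_product and closed_under_inverse

e = (0, 1, 2)
r = (1, 2, 0)          # cyclic shift
r2 = compose(r, r)     # its square

rotations = {e, r, r2}
not_a_group = {e, r}   # r r = r2 is missing, so (a) fails
```

For the set `rotations` both conditions hold, and the identity `e` is then automatically a member, as the text observes.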

7. By way of an important example we take the group of all
nonsingular (real or complex) linear transformations of n
variables. In this case the set M is an n-dimensional coordinate
real or complex space. That nonsingular linear transformations
constitute a group was actually shown in Section 3 of Chapter II.
There it was demonstrated that the product of nonsingular linear
transformations is a nonsingular linear transformation, and the
inverse of a nonsingular linear transformation is a nonsingular
linear transformation. Thus conditions (a) and (b) of Subsection 6
are satisfied.

8. Let G, G' be two groups and let the group G be mapped
onto G'. Let us agree to use primes to indicate images: for
example, a' ∈ G' is the image of the element a ∈ G.

Definition. A one-to-one mapping of G onto G' is called an
isomorphism if the image of a product is equal to the product of the
images; symbolically,

    (ab)' = a'b'    (6)

We now prove that under an isomorphism the image e' of the
unit e of G is the unit of the group G'. Indeed, let a' be any
element of G'; it corresponds to an element a of group G. We thus
have ae = a and so, by the definition of an isomorphism,

    a'e' = (ae)' = a'    (7)

or e' is the unit of G'.

If an isomorphism exists from G to G', then the groups G and
G' are said to be isomorphic to each other. Under an isomorphism,
all the relations between elements of one group carry over to the
other group. Therefore from the viewpoint of group theory two
isomorphic groups have the same structure. It is enough to study
one in order to know the other.

Examples. (1) Let G be the group of all real nonsingular n × n
matrices and G' the group of all real nonsingular linear
transformations of n variables (viewed as transformations of
coordinate space K_n). We associate with an arbitrary matrix A in G
a linear transformation in G' having the matrix A. In this fashion,
the group G is mapped one-to-one onto G'. This mapping is an
isomorphism since the transformation with the matrix AB is the
product of transformations having matrices A and B. Under this
isomorphism, all group relations, in particular, all subgroups,
carry over from G to G'. For instance, the subgroup of matrices in G
whose determinant has absolute value one is associated with a
definite subgroup in G', which consists of linear transformations
with determinant having absolute value one. Later on we will
examine certain other important subgroups in G and G' that
correspond to one another.

(2) If the linear spaces L and L' are linearly isomorphic, then
they are also isomorphic as groups. Generally speaking, the
converse is not true. Indeed, from Sections 10, 11 of Chapter I it
follows that an n-dimensional complex space C_n and a real space
L_2n of dimension 2n are isomorphic as groups (relative to the
operation of addition of vectors), yet at the same time they are
not linearly isomorphic spaces.

(3) Using Subsection 6, we can easily verify that the collection
of linear functions t = λτ for all possible λ ≠ 0 forms a group
of transformations of the number line −∞ < τ < +∞. Denote
this group by G. It is called the group of linear transformations
of the number line. Let G' be the group of real numbers λ (λ ≠ 0)
under multiplication. Associating to every transformation t = λτ
a number λ, we get an isomorphic mapping of G onto G', which is
a special case of Example (1) when n = 1. In Subsection 6 of
Section 1 we indicated two subgroups in G'. Associated to them
in G are two subgroups defined by the conditions: (1) λ = ±1,
(2) λ > 0.

The first of these subgroups consists of only two
transformations: the identity mapping of the number axis t = τ and
the reflection t = −τ.

The  second  subgroup  consists  of  an  infinite  set  of  transforma¬ 
tions,  namely,  of  all  linear  transformations  preserving  the  direc¬ 
tion  of  the  number  axis. 

9.  Definition.  A  mapping  of  a  group  G  into  a  group  G'  is  called 
a  homomorphism  if  the  image  of  the  product  of  any  two  elements 
of  G  is  the  product  of  their  images  in  G'. 

In other words, only condition (6) must be obeyed. Note that it
may happen that a' = b' when a ≠ b and that some elements of G'
are not images of any elements of G. We symbolize a
homomorphism by G → G'.

It  is  clear  that  an  isomorphism  is  a  special  case  of  a  homo¬ 
morphism. 

Another special case of a homomorphism is obtained if we
associate to each element of an arbitrarily chosen group G the unit
of some group G'. Then we have: a' = e', b' = e', (ab)' = e' = a'b',
and (6) holds.

10. Theorem. Under any homomorphism G → G', the image of
the group G is a subgroup of G'.

Proof. Denote the image of G by G̃. Let a', b' be arbitrary
elements in G̃, and a, b certain of their inverse images. By (6), G̃
is closed under multiplication:

    a'b' = (ab)' ∈ G̃    (8)

From (6) it also follows that

    a'(a⁻¹)' = (aa⁻¹)' = e'    (9)

As in Subsection 8 (see formula (7)), it is established that the
image e' of the unit element of the group G is the unit of G'.
Therefore (9) signifies that

    (a')⁻¹ = (a⁻¹)' ∈ G̃    (10)

The relations (8) and (10) show that G̃ satisfies the definition
of a subgroup.

Remark. It can be demonstrated that the collection of all
inverse images of the unit element of the group G' under the
homomorphism G → G' forms a subgroup of G called the kernel of the
homomorphism G → G'. We will not dwell on the proof.




11.  We  now  consider  some  examples  of  homomorphisms  that  will 
come  in  handy  in  the  future. 

Let G be the group of all nonsingular real n × n matrices, G'
the group of real numbers λ (λ ≠ 0) under multiplication. We
construct the following mappings of G into G':

(1) to each matrix A ∈ G is associated the same number λ = 1;

(2) all matrices for which det A > 0 map into the number
λ = +1; all matrices for which det A < 0 have as their image the
number λ = −1;

(3) let σ be any fixed real number; to every matrix A ∈ G is
associated a number λ = |det A|^σ;

(4) to the matrix A is associated λ = |det A|^σ if det A > 0, and
λ = −|det A|^σ if det A < 0.

We have the homomorphism G → G' in all four examples. In the
first example this occurs because the entire group G is mapped
into the unit of the group G', in the other three cases because of
the theorem on the determinant of a product of matrices.
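
Condition (6) for these four mappings can be verified on sample matrices. The following Python sketch works with 2 × 2 integer matrices and takes the exponent σ = 2 so that all arithmetic stays exact; the choice n = 2, the value of σ, and the function names `hom1` to `hom4` are assumptions made for illustration.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

SIGMA = 2  # illustrative exponent; any fixed real number is allowed in the text

def hom1(A):
    return 1

def hom2(A):
    return 1 if det2(A) > 0 else -1

def hom3(A):
    return abs(det2(A)) ** SIGMA

def hom4(A):
    return hom2(A) * abs(det2(A)) ** SIGMA

def is_multiplicative(h, A, B):
    """Condition (6): the image of a product equals the product of the images."""
    return h(matmul2(A, B)) == h(A) * h(B)

A = ((1, 2), (3, 4))   # det = -2
B = ((2, 0), (1, 3))   # det = 6
results = [is_multiplicative(h, A, B) for h in (hom1, hom2, hom3, hom4)]
```

All four checks succeed because det(AB) = det A · det B, which is exactly the theorem the text invokes.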

Instead of the group of numbers λ (λ ≠ 0) we can take the
isomorphic group of linear transformations t = λτ (λ ≠ 0) of the
number line. Then we get four homomorphisms in which the
images of G are groups of transformations of the number line
consisting, respectively, of

(1) the single identity transformation: t = τ;

(2) two transformations: t = τ and t = −τ;

(3) all transformations t = λτ for which λ > 0;

(4) all linear transformations t = λτ (λ ≠ 0).

It turns out that the above four types of mappings of G into G'
given in this subsection exhaust all possible homomorphisms of G
into G'. Essential use will be made of this assertion in the next
section (see Section 3, Subsection 8), where hints with respect to
the proof will be given.

§  3.  Invariants.  Axial  invariants.  Pseudoinvariants 

1. We have an n-dimensional linear space L, which we assume
to be real. This is done to simplify further formulations. Suppose
in L we choose a fixed class of bases S with respect to some
subgroup G of nonsingular (real) matrices.

Besides  L  we  consider  a  set  T,  the  specific  nature  of  the  ele¬ 
ments  of  which  is  immaterial.  Actually,  for  T  we  will  have  collec¬ 
tions  of  geometric  entities  of  the  space  L  or  some  kind  of  al¬ 
gebraic  entities  connected  with  this  space. 

Let there be given a numerical function a of two arguments:
an arbitrary element t of the set T and an arbitrary basis e taken
from the class S:

    a = ψ(t, e)    (1)




Suppose that the values of the function (1) are real and, for all
possible t ∈ T, fill the entire number axis −∞ < a < +∞ (for
any fixed e).

We can imagine (as is frequently done in geometry) that the
right member of (1) is a symbolic notation for a function of the
coordinates of entity t. Accordingly, instead of (1) we can write

    a = ψ(x₁, x₂, ..., x_N)    (1a)

where x₁, x₂, ..., x_N are the coordinates of t relative to the basis
e, that is to say, numbers which in some way determine the entity
t when the basis e is specified.

Example. The entity t is a parallelogram constructed in a
Euclidean plane on an ordered pair of vectors p, q:

    p = {x₁, x₂},    q = {y₁, y₂}

For (1a) we write (in this case)

    a = x₁y₂ − x₂y₁

Here it is assumed that x₁, x₂ and y₁, y₂ are the coordinates
(components) of p and q relative to a basis e. Then x₁, x₂, y₁, y₂
may be taken to be the components of t in the same basis. If e is
taken in the class of orthonormal bases of a Euclidean plane, then
the number a will be an oriented area of the parallelogram t.

2. Suppose we have the value a = ψ(t, e) and suppose that we
pass from the basis e to any other basis e' of the same class S:

    e' = Pe,    P ∈ G

We will require that the number a' = ψ(t, e') be defined by
specification of the number a and the matrix P without any more
information about the entity t and the original basis e. In other
words, we assume that a' is a function of a and of the elements
P^i_{i'} of the matrix P = ||P^i_{i'}||. Symbolically,

    a' = f(a, P)    (2)

We also assume that for every matrix P in G the function (2)
specifies a one-to-one transformation of the number axis
−∞ < a < +∞, which transformation we indicate by the symbol f_P.
In place of (2) we will often make use of the equivalent notation

    a' = f_P(a)

3.  When  conforming  to  the  requirements  of  Subsections  1  and  2, 
we  will  say  that  a  scalar  quantity  a  has  been  specified  in  the 
space  L  on  the  set  T  relative  to  the  group  G.  Formula  (2)  is  cal¬ 
led  the  law  of  transformation  of  the  scalar  quantity  a. 



4. The function f(a, P) cannot be chosen arbitrarily. The
conditions of Subsection 2 impose rigorous restrictions, the essence
of which consists in the transformations a' = f_P(a) constituting a
group. More precisely, we have the

Theorem. The law of transformation of a scalar quantity is a
homomorphism of the group G of matrices into the group of all
one-to-one transformations of the number line.

Explanation. Formula (2) indicates that to every matrix P ∈ G
is associated a transformation a' = f_P(a) of the number line
−∞ < a < +∞. The theorem asserts that this correspondence
is a homomorphic mapping.

Denote by H the set of transformations a' = f_P(a) corresponding
to all possible matrices P in G. Then there follows from the
theorem and from Subsections 4 and 10 of Section 2 the

Corollary. The set H is a group of transformations of the number
line −∞ < a < +∞.

Proof of the theorem. Since the one-to-one nature of each
transformation a' = f_P(a) is given, and all one-to-one
transformations of the number line −∞ < a < +∞ constitute a group,
it suffices to verify that

    f_{P'} f_P = f_{P'P}    (3)

for any matrices P, P' in G. Take an arbitrary basis e ∈ S and
consider the bases e' = Pe, e'' = P'e' = (P'P)e. We denote by
a, a', a'' the values of the scalar quantity (1) relative to the
bases e, e' and e'', respectively. By Subsection 2, we have

    a'' = f(a', P') = f(a, P'P)    (4)

From (2) and (4) we get the condition imposed on the function f:

    f(f(a, P), P') = f(a, P'P)    (5)

To the right member of (5) corresponds the transformation f_{P'P}.
The composite function on the left of (5) is associated with a
transformation equal to the product f_{P'} f_P. Therefore relation (5)
(where a is any scalar and −∞ < a < +∞) is equivalent to
(3). This completes the proof.
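
Condition (5) can be tested directly on candidate transformation laws. The Python sketch below compares one law that satisfies (5) with one that does not; both laws, the 2 × 2 integer matrices, and the exponent 2 are illustrative assumptions rather than examples from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def law_good(a, P):
    """a' = sign(det P) * |det P|^2 * a  (an assumed law built from the determinant)."""
    d = det2(P)
    sign = 1 if d > 0 else -1
    return sign * abs(d) ** 2 * a

def law_bad(a, P):
    """a' = a + det P  (an assumed law that violates condition (5))."""
    return a + det2(P)

def satisfies_condition_5(law, a, P, P_prime):
    """Compare the passage e -> e' -> e'' with the direct passage e -> e''."""
    via_intermediate = law(law(a, P), P_prime)
    direct = law(a, matmul2(P_prime, P))
    return via_intermediate == direct

P = ((1, 2), (3, 4))         # det = -2
P_prime = ((2, 0), (1, 3))   # det = 6
good_ok = satisfies_condition_5(law_good, 5, P, P_prime)
bad_ok = satisfies_condition_5(law_bad, 5, P, P_prime)
```

The second law fails because the determinant of a product is the product, not the sum, of the determinants, so the two routes from e to e'' disagree.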

5. Remark. The truth of the theorem may be demonstrated by
somewhat more pictorial reasoning. Let the matrix P ∈ G yield
a transition from the basis e to the basis e', and the matrix P' ∈ G
a transition from the basis e' to the basis e''. Then the matrix
P'P ∈ G yields a direct transition from e to e''. Then we
recalculate the values of our quantity by proceeding from e and
going over to e'': once via e', the next time directly. If
f_{P'} f_P ≠ f_{P'P}, then we get different results, which is
inadmissible since the value of the quantity must be uniquely
defined for each entity in T relative to each basis of the class S.
Hence f_{P'} f_P = f_{P'P}, which is what we wanted.


6.  Every  homomorphism  of  a  group  G  into  any  group  of  trans¬ 
formations  of  the  number  line  specifies  a  scalar  quantity  in  the 
space  L. 

Let us go into this matter in more detail. Let a homomorphism
associate to a matrix P ∈ G a one-to-one transformation f_P of the
number line, that is, a function a' = f_P(a) specified on the entire
line and having an inverse function also specified on the entire
line. Set f(a, P) = f_P(a). Then from (3) follow (4) and (5),
whence it follows that a is uniquely determined in all bases of the
class S.

We have to construct a set T, that is, we have to determine the
geometric entities t on which a scalar quantity a would be
specified with a given law of transformation f(a, P). This can be
done in different ways. For instance, for t we can take a point of
the number line with coordinate a on the ordinary Cartesian scale.
At the same time we must assume that an arbitrary basis e has been
chosen in the class S. In going over to a new basis e' = Pe, we
pass to a different scale on the number line by the formula
a' = f_P(a). We will assume that, relative to the basis e', the same
point t is associated with its coordinate a' on the new scale. Then
all requirements of Subsections 1, 2 will be complied with.


7. With particular frequency we encounter so-called linear
geometric entities or linear scalar quantities, which are
characterized by the fact that the transformations f_P are linear,
that is, the transformation law (2) is of the form

    a' = f(P) a

In this case, in place of (5) we have a simpler relation:

    f(P) f(P') = f(PP')    (6)

imposed solely on the matrices P, P' (any matrices in G).

The relation (6) can readily be derived at once without resorting
to (5). Indeed, if

    a' = f(P) a,    a'' = f(P') a'

then

    a'' = f(P) f(P') a

On the other hand, we must directly have

    a'' = f(PP') a

Thus, (6) is fulfilled.



8. Now let G be the group of all real nonsingular n × n
matrices (n fixed).

As has already been mentioned (without proof) in Subsection 11,
Section 2, all homomorphisms of G into the group of real numbers
(under multiplication) reduce to four types of mappings that are
listed in Subsection 11, Section 2.

This assertion can be expressed as follows.

If a numerical function f(P) of a matrix argument satisfies (6)
for any P, P' ∈ G, then:

(1) either f(P) = 1 for all P ∈ G;

(2) or f(P) = 1 for P ∈ G and det P > 0, and f(P) = −1 for
P ∈ G and det P < 0;

(3) or f(P) = |det P|^σ for all P ∈ G (here σ is a given real
number);

(4) or f(P) = ±|det P|^σ, where the plus sign occurs for P ∈ G,
det P > 0, the minus sign for P ∈ G, det P < 0.

Note that the cases (3) and (4) include, in particular, the cases
(1) and (2) when σ = 0. Note also that we do not consider the
trivial case where f(P) is identically zero: f(P) = 0 for all P ∈ G.

The proof of this assertion can only be given with the help of
material taken from subsequent chapters, and so we give the proof
in a special appendix (see Appendix 1).

However, we will make use of the assertion at once. It will
enable us to enumerate all possible types of linear quantities.
Namely, there exist only the following four types of linear scalar
quantities.

(1) Invariants, which are quantities that do not depend on the
choice of basis. Their law of transformation is

    a' = a    (I)

for any matrix P. The group H consists of a single identity
transformation of the number axis.

In the preceding chapters we have assumed all along that we
are dealing with such quantities (for instance, when we considered
linear, quadratic, bilinear and multilinear forms).

(2) Axial invariants. Their law of transformation is

    a' = a     if det P > 0,
    a' = −a    if det P < 0        (II)

Here the group H consists of two linear transformations a' = a
and a' = −a.

The name “axial invariants” expresses the dependence of these
quantities on the orientation of the coordinate axes. They do not
change when passing to a new basis with orientation preserved,
but they change sign if the orientation of the basis is reversed.




Example.  An  instance  of  an  axial  invariant  is  the  oriented  area 
of  an  oriented  parallelogram  of  a  Euclidean  plane.  This  quantity 
is  positive  if  the  pair  of  vectors  defining  the  parallelogram  and  its 
orientation  have  the  same  orientation  as  the  basis;  otherwise  it  is 
negative.  The  elements  of  the  set  T  are  all  possible  oriented  paral¬ 
lelograms  on  the  Euclidean  plane. 
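
The behaviour of the oriented area under a change of orthonormal basis can be followed on two simple cases. The Python sketch below assumes components relative to an orthonormal basis e₁, e₂ and considers two hypothetical new bases: one obtained by interchanging the basis vectors (det P = −1) and one obtained by a quarter turn (det P = +1).

```python
def oriented_area(p, q):
    """a = x1*y2 - x2*y1 for p = (x1, x2), q = (y1, y2)."""
    return p[0] * q[1] - p[1] * q[0]

def components_after_swap(v):
    """New basis e1' = e2, e2' = e1 (det P = -1): components are interchanged."""
    return (v[1], v[0])

def components_after_quarter_turn(v):
    """New basis e1' = e2, e2' = -e1 (det P = +1): v = x2*e1' - x1*e2'."""
    return (v[1], -v[0])

p = (3, 1)
q = (1, 2)

a = oriented_area(p, q)
a_swapped = oriented_area(components_after_swap(p), components_after_swap(q))
a_rotated = oriented_area(components_after_quarter_turn(p),
                          components_after_quarter_turn(q))
```

The value is unchanged by the quarter turn and changes sign under the interchange of basis vectors, in agreement with law (II).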

(3) Pseudoinvariants of weight σ. Their law of transformation
is

    a' = a |det P|^σ    (III)

where σ is a given real number.

Here, to each matrix P is associated a linear transformation
a' = λa for λ = |det P|^σ. The group H consists of all linear
transformations a' = λa with positive coefficient λ.

Example. By Section 3 of Chapter IV, the determinant of the
matrix of an invariant bilinear form transforms by the law

    Δ' = Δ (det P)²

Thus the quantity Δ is a pseudoinvariant of weight σ = 2. The
set T in this example consists of all possible invariant bilinear
forms specified in the space L.
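
The weight-2 law for the determinant of a bilinear form can be checked on a small case. The Python sketch below assumes that the matrix B of the form passes to B' = P B Pᵀ under the change of basis with matrix P; depending on index conventions the transposes may sit the other way round, which does not affect the determinant. The matrices are arbitrary illustrative choices.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def transpose2(m):
    """Transpose of a 2x2 matrix."""
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))

B = ((1, 2), (0, 3))   # matrix of a bilinear form in the old basis, det = 3
P = ((1, 2), (3, 4))   # matrix of the change of basis, det = -2

B_new = matmul2(matmul2(P, B), transpose2(P))   # assumed law B' = P B P^T
delta_old = det2(B)
delta_new = det2(B_new)
```

Here `delta_new` equals `delta_old * det2(P) ** 2`, and the sign of det P plays no role, as expected for a pseudoinvariant of even weight.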

(4) Axial pseudoinvariants of weight σ:

    a' = a |det P|^σ     if det P > 0,
    a' = −a |det P|^σ    if det P < 0        (IV)

where σ is a given real number.

9. If G is some group of real $n \times n$ matrices, then the scalar
quantities with transformation laws (I), (II), (III) and (IV) are
called, respectively, invariants, axial invariants, pseudoinvariants
of weight $\sigma$, and axial pseudoinvariants of weight $\sigma$ with respect
to the group G. In the case of the group of matrices for which
$|\det P| = 1$, pseudoinvariants do not differ from invariants and
axial pseudoinvariants do not differ from axial invariants.

But if we impose the condition $\det P = +1$, then all four classes
of quantities become indistinguishable (they reduce to invariants
with respect to the indicated group).

Note  in  passing  that  the  group  of  matrices  with  determinant 
equal  to  unity  is  ordinarily  called  the  unimodular  group  (both  in 
the  real  and  the  complex  case). 

10.  The  term  “invariant”  is  often  used  in  a  broader  sense  than 
that  of  Subsections  8,  9. 

Namely,  the  invariants  of  a  group  of  transformations  are  all 
entities,  properties,  and  quantities  that  are  preserved  under  any 
transformation  of  the  given  group. 



It  is  obvious  that  every  invariant  of  a  group  is  an  invariant  for 
any  subgroup  of  that  group.  The  converse  is  not  true:  an  invariant 
of  a  subgroup  may  not  be  an  invariant  of  the  entire  group.  In  this 
sense we can say that the broader the group of transformations,
the fewer invariants it has, but such invariants reflect
the most stable and profound properties of reality.

Geometry  breaks  down  into  a  number  of  divisions,  in  each  of 
which one investigates the invariants of a definite group of
transformations of some space. For example, in elementary geometry
we  consider  in  three-dimensional  Euclidean  space  the  properties 
of  figures  that  are  preserved  under  any  motion  of  the  figure  as  a 
rigid  body  (in  other  words,  invariants  of  the  group  of  motions  of 
three-dimensional  Euclidean  space).  In  the  chapters  that  follow 
we  will  examine  several  important  groups  of  transformations  and 
some  of  their  invariants. 

§  4.  Tensor  quantities 

1.  We  now  define  certain  classes  of  quantities  allied  to  tensors 
and  including  the  latter  as  a  special  case.  We  will  not  attempt  to 
interpret  these  quantities  geometrically.  All  we  plan  to  assume  is 
that  relative  to  every  basis  they  are  specified  by  a  specific  set  of 
numbers  called  components  (coordinates)  and  that  when  passing 
to  a  new  basis  these  numbers  (components)  transform  just  as  the 
coefficients of multilinear forms. So as not to complicate our
description with cumbersome formulas, we will suppose that the
components (the coordinates of the quantities) are equipped with two
indices  (one  lower  and  one  upper).  Accordingly,  we  will  consider 
forms  of  two  vector  arguments  (one  contravariant  and  the  other 
covariant).  The  passage  to  any  larger  number  of  indices  is  trivial. 

2. Let a bilinear form $a(x, u)$, $x \in L$, $u \in L^*$, be given in an
n-dimensional linear space L. If L has a basis $e_1, \ldots, e_n$, and
$L^*$ has a reciprocal basis $e^1, \ldots, e^n$, then
$x = x^1 e_1 + \ldots + x^n e_n$, $u = u_1 e^1 + \ldots + u_n e^n$, and the form
$a(x, u)$ can be represented componentwise (in coordinates):

$$a(x, u) = \sum a_i^k x^i u_k$$

where

$$a_i^k = a(e_i, e^k) \tag{1}$$

When passing to a new basis we have

$$e_{i'} = \sum P_{i'}^{i} e_i, \qquad e^{k'} = \sum Q_k^{k'} e^k \tag{2}$$

The coefficients of the component representation of the form
change in the process. The new coefficients $a_{i'}^{k'}$ will be
expressed in terms of the old coefficients $a_i^k$ in one way or another,
depending on the nature of the form itself as a scalar quantity.
Namely, it may happen that the numerical value $a(x, u)$ of the given
form on an arbitrary pair of vectors x, u will be replaced by a new
value $a'(x, u)$. Then the law of transformation of $a(x, u)$ into
$a'(x, u)$ determines the law of transformation of $a_i^k$ into $a_{i'}^{k'}$. We
assume that the given form, being a scalar quantity, belongs to
one of the four classes indicated in Subsection 7 of this section.
We accordingly consider four cases.

3. (1) The form $a(x, u)$ is an invariant. In this case

$$a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = a(e_{i'}, e^{k'})$$

whence, with account taken of (1) and (2), we get

$$a_{i'}^{k'} = \sum a_i^k P_{i'}^{i} Q_k^{k'} \tag{I}$$

This is the familiar law of transformation of the components of
a mixed tensor of second order.

(2) The form $a(x, u)$ is an axial invariant. In this case

$$a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = \pm\, a(e_{i'}, e^{k'})$$

where we have the plus sign if $\det P > 0$ and the minus sign if
$\det P < 0$, whence, taking into account (1) and (2),

$$a_{i'}^{k'} = \pm \sum a_i^k P_{i'}^{i} Q_k^{k'} \tag{II}$$

with the same condition regarding the sign in the right-hand
member.

Quantities whose components $a_i^k$ transform by law (II) are
called axial tensors.

(3) The form $a(x, u)$ is a pseudoinvariant of weight $\sigma$. In this
case

$$a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = a(e_{i'}, e^{k'})\,|\det P|^{\sigma}$$

whence, taking into account (1) and (2),

$$a_{i'}^{k'} = |\det P|^{\sigma} \sum a_i^k P_{i'}^{i} Q_k^{k'} \tag{III}$$

Quantities whose components $a_i^k$ transform by law (III) are
called pseudotensors of weight $\sigma$.

(4) The form $a(x, u)$ is an axial pseudoinvariant of weight $\sigma$.
In this case

$$a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = \pm\, a(e_{i'}, e^{k'})\,|\det P|^{\sigma}$$

and, accordingly,

$$a_{i'}^{k'} = \pm\,|\det P|^{\sigma} \sum a_i^k P_{i'}^{i} Q_k^{k'} \tag{IV}$$

where on the right we have plus if $\det P > 0$ and minus if
$\det P < 0$.

Quantities with components $a_i^k$ that transform by law (IV) are
called axial pseudotensors of weight $\sigma$.
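
The four laws differ only in the scalar factor placed in front of the same sum. The following sketch, written for n = 2 with exact rational arithmetic, makes this explicit and checks that the value of the form on a fixed pair x, u acquires exactly that factor. It assumes that `a[i][k]` stores $a_i^k$, that `P[i][j]` stores $P_{j'}^{i}$, that Q is the inverse of P, and that the weight is an integer; all function names are hypothetical.

```python
from fractions import Fraction


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    """Inverse of a nonsingular 2 x 2 matrix, computed exactly."""
    d = Fraction(det2(m))
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def factor(P, law, sigma=0):
    """Scalar factor of laws 'I', 'II', 'III', 'IV' (integer weight assumed)."""
    d = det2(P)
    sign = 1 if d > 0 else -1
    if law == "I":
        return 1
    if law == "II":
        return sign
    if law == "III":
        return Fraction(abs(d)) ** sigma
    if law == "IV":
        return sign * Fraction(abs(d)) ** sigma
    raise ValueError(law)


def transform(a, P, law, sigma=0):
    """New components: factor * sum over i, k of a[i][k] P[i][i'] Q[k'][k]."""
    Q = inverse2(P)
    s = factor(P, law, sigma)
    return [[s * sum(a[i][k] * P[i][ip] * Q[kp][k]
                     for i in range(2) for k in range(2))
             for kp in range(2)] for ip in range(2)]


def new_vector(x, P):
    """Contravariant components in the new basis: x^{i'} = sum Q^{i'}_i x^i."""
    Q = inverse2(P)
    return [sum(Q[ip][i] * x[i] for i in range(2)) for ip in range(2)]


def new_covector(u, P):
    """Covariant components in the new basis: u_{k'} = sum P^k_{k'} u_k."""
    return [sum(P[k][kp] * u[k] for k in range(2)) for kp in range(2)]


def value(a, x, u):
    """Value of the form: sum over i, k of a[i][k] x[i] u[k]."""
    return sum(a[i][k] * x[i] * u[k] for i in range(2) for k in range(2))
```

For instance, with `P = [[0, 2], [1, 0]]` (so that det P = -2) and weight 2, `factor(P, "IV", 2)` is -4, and `value(transform(a, P, "IV", 2), new_vector(x, P), new_covector(u, P))` equals -4 times `value(a, x, u)` for any `a`, `x`, `u`.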

4. The transformation laws (I)-(IV) for $a_i^k$ were derived as
a consequence of the corresponding transformation laws (I)-(IV),
Section 3, for the scalar quantity $a(x, u)$. Clearly, the converse is
also true: if $a_i^k$ transform by the laws (I)-(IV), then for the scalar
quantity

$$a(x, u) = \sum a_i^k x^i u_k$$

we have, respectively, the transformation laws (I)-(IV) of
Section 3.

5. In this section we assume that the transformation (2) is
determined by any nonsingular matrix P. We can suppose that the
matrices P are taken from some group G while the admissible
bases constitute the appropriate class $\mathscr{E}$. Then the foregoing
definitions give us four classes of tensor quantities under the group G.

6. We will view the collection of numbers $a_i^k$ as a point in the
coordinate space K (of dimension $n^2$). Then any one of the four
laws (I)-(IV) defines, via the given matrix $P \in G$, a certain
transformation of the space K (naturally, the appropriate
transformation for each law (I)-(IV)). We denote this transformation
for any one of the laws (I)-(IV) by $f_P$. The following statements
hold true.

(a) The set of all $f_P$ ($P \in G$) is a certain group H of
transformations of the space K.

(b) The map of G onto H under which the matrix $P \in G$ is
associated with the transformation $f_P \in H$ is a homomorphism,
namely,

$$f_{P'} f_P = f_{P'P} \tag{3}$$

for any matrices $P', P \in G$.

The  proof  is  similar  to  that  of  Subsection  4,  Section  3. 

Relation  (3) is  very  important.  If  it  were  not  true,  then  the  laws 
(I)-(IV)  would  be  meaningless,  since  distinct  transitions  to  a 
new  basis  (either  directly  or  via  intermediate  bases)  would  yield 
unlike  results. 

7. From the transformation laws (I)-(IV) it follows that if all
the components of a tensor quantity vanish in one basis then they
vanish in any other basis: if $a_i^k = 0$, then $a_{i'}^{k'} = 0$.




8. If the components of a tensor quantity (of any one of the
four classes) are labelled with many indices, then subscripts are
used for those over which, in the transformation laws (I)-(IV),
the summation is with the upper indices of elements of the
matrix P. The number of lower indices is called the order of
covariance of the tensor quantity. The remaining indices are written
as superscripts; their number is equal to the order of
contravariance.

Remark.  The  scalar  quantities  of  any  one  of  the  four  classes 
mentioned  in  Section  3  may  be  viewed  as  tensor  quantities  (of  the 
appropriate  class)  of  order  zero. 

9. Let T denote the set of all tensor quantities of one of the
classes (I)-(IV) of the same type (that is to say, with the same
number of upper and lower indices). Now, if linear operations are
performed on the elements of T in coordinate space (that is, if we
construct the sum of some quantities by adding their appropriate
components, and the product of a quantity by a scalar via
multiplication of all components by that scalar), then we get tensor
quantities of the same set T. This is evident if, for the sake of
simplicity, we take for T the set of mixed pseudotensors of order two
of a given weight $\sigma$. We consider two pseudotensors in T with
components $a_i^k$ and $b_i^k$ relative to a certain basis. In the new basis
we get $a_{i'}^{k'}$, $b_{i'}^{k'}$. We can assume that $a_{i'}^{k'}$ are expressed by
equation (III) of Subsection 3. Similarly,

$$b_{i'}^{k'} = |\det P|^{\sigma} \sum b_i^k P_{i'}^{i} Q_k^{k'} \tag{IIIa}$$

Adding (III) and (IIIa) termwise, we get

$$a_{i'}^{k'} + b_{i'}^{k'} = |\det P|^{\sigma} \sum (a_i^k + b_i^k)\, P_{i'}^{i} Q_k^{k'}$$

Thus, the sum of two tensor quantities of T has exactly the
same transformation law as each one of the quantities. Multiplying
both sides of (III) by an arbitrary scalar $\alpha$, we see that $\alpha a_{i'}^{k'}$
is expressed in terms of $\alpha a_i^k$ by the same law.

10. If two tensor quantities belong to T, then their equality is of
an invariant nature. Let's look at this in more detail. Suppose, for
instance, for $a_i^k$ and $b_i^k$ we have, relative to a single basis, the
equations $b_i^k = a_i^k$ for arbitrary i, k. Then relative to any other
basis, $b_{i'}^{k'} = a_{i'}^{k'}$. This is evident from the fact that, by
Subsection 9, the difference $b_i^k - a_i^k$ is a tensor quantity and, hence,
its vanishing does not depend on the basis.

Remark. Of course, in a single basis, the equation $b_i^k = a_i^k$ is
possible for any quantities $a_i^k$, $b_i^k$. But if these quantities are
taken from different classes (I)-(IV), that is, if they have
different laws of transformation, then the equation breaks down when
passing to new bases.

11. A product of two arbitrary tensor quantities taken from any
one of the classes (I)-(IV) is constructed by multiplying each
component of one quantity by each component of the other
(relative to the same basis). The resultant quantity will belong to
one of the classes (I)-(IV), depending on the choice of factors.
For example, if $a_i^k$ is a pseudotensor of weight $\sigma_1$ and $b_j^l$ is a
pseudotensor of weight $\sigma_2$, then $a_i^k b_j^l$ will be a pseudotensor of
order four and of weight $\sigma_1 + \sigma_2$. Indeed, under these
assumptions,

$$a_{i'}^{k'} = |\det P|^{\sigma_1} \sum a_i^k P_{i'}^{i} Q_k^{k'}$$
$$b_{j'}^{l'} = |\det P|^{\sigma_2} \sum b_j^l P_{j'}^{j} Q_l^{l'}$$

Multiplying these equations together, we get

$$a_{i'}^{k'} b_{j'}^{l'} = |\det P|^{\sigma_1 + \sigma_2} \sum a_i^k b_j^l\, P_{i'}^{i} P_{j'}^{j} Q_k^{k'} Q_l^{l'}$$

12.  Note  that  the  product  of  two  axial  tensors  is  an  ordinary 
tensor.  The  product  of  an  ordinary  tensor  by  an  axial  tensor  is  an 
axial  tensor. 

13. Contraction on a single upper and a single lower index of a
tensor quantity of one of the four classes (I)-(IV) yields a
tensor quantity of the same class. A complete contraction reduces the
tensor to a scalar quantity of the same class. For instance, the
contraction of a mixed pseudotensor of weight $\sigma$ and of order two
is a pseudoinvariant of weight $\sigma$. Indeed, from (III) we have

$$\sum_{\alpha'} a_{\alpha'}^{\alpha'} = |\det P|^{\sigma} \sum_{i,k} \Bigl( a_i^k \sum_{\alpha'} P_{\alpha'}^{i} Q_k^{\alpha'} \Bigr) = |\det P|^{\sigma} \sum_{i,k} a_i^k \delta_k^i = |\det P|^{\sigma} \sum_{\alpha} a_{\alpha}^{\alpha}$$

Thus, for the quantity $a = \sum a_{\alpha}^{\alpha}$ we have a transformation law
of type (III), Section 3.
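
The computation above can be reproduced numerically. The sketch below, for n = 2 and an integer weight, transforms a mixed pseudotensor by law (III) and compares the contractions before and after. It assumes that `a[i][k]` stores $a_i^k$, that `P[i][j]` stores $P_{j'}^{i}$, and that Q is the inverse of P; the function names are illustrative only.

```python
from fractions import Fraction


def pseudotensor_transform(a, P, sigma):
    """Law (III) for a 2 x 2 mixed pseudotensor of integer weight sigma."""
    d = Fraction(P[0][0] * P[1][1] - P[0][1] * P[1][0])
    # Q is the exact inverse of P.
    Q = [[P[1][1] / d, -P[0][1] / d],
         [-P[1][0] / d, P[0][0] / d]]
    weight_factor = abs(d) ** sigma
    return [[weight_factor * sum(a[i][k] * P[i][ip] * Q[kp][k]
                                 for i in range(2) for k in range(2))
             for kp in range(2)] for ip in range(2)]


def contraction(a):
    """Complete contraction: the sum of the components with equal indices."""
    return a[0][0] + a[1][1]


a = [[1, 2], [3, 4]]
P = [[2, 1], [1, 3]]          # det P = 5
a_new = pseudotensor_transform(a, P, 2)

# The contraction is multiplied by |det P|^2 = 25, as law (III) of Section 3 requires.
print(contraction(a_new), 25 * contraction(a))
```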

§ 5. The oriented volume of a parallelepiped. The discriminant tensor

1. A basis $e_1, \ldots, e_n$ is chosen in an n-dimensional linear
space L, which means an orientation of the space is given (see
Section 1). In L we take an ordered n-tuple of arbitrary vectors
$x_1, \ldots, x_n$ and expand each one with respect to the given basis:

$$x_1 = x_1^1 e_1 + \ldots + x_1^n e_n,$$
$$\ldots$$
$$x_n = x_n^1 e_1 + \ldots + x_n^n e_n$$




By X we denote the matrix made up of the coefficients of these
expansions, that is, the components of the vectors $x_1, \ldots, x_n$
relative to the basis $e_1, \ldots, e_n$.

With the ordered n-tuple of vectors $x_1, \ldots, x_n$ we associate a
number $D(x_1, \ldots, x_n)$ equal to the determinant of matrix X:

$$D(x_1, \ldots, x_n) = \begin{vmatrix} x_1^1 & \ldots & x_1^n \\ \vdots & & \vdots \\ x_n^1 & \ldots & x_n^n \end{vmatrix}$$

Changing to a new basis

$$e_{i'} = \sum P_{i'}^{i} e_i \tag{1}$$

we associate with the same n-tuple of vectors $x_1, \ldots, x_n$ a
number $D'(x_1, \ldots, x_n)$ equal to the determinant of the matrix X' made
up of the components of the vectors $x_1, \ldots, x_n$ relative to the basis
$e_{1'}, \ldots, e_{n'}$.

It is easy to find the law of transformation of the quantity
$D(x_1, \ldots, x_n)$. Namely, together with (1) we have the following
equations for the components of any vector:

$$x^{i'} = \sum Q_i^{i'} x^i$$

whence we get the matrix equation

$$X' = XQ^*$$

Hence

$$D'(x_1, \ldots, x_n) = \det X' = \det X \det Q^* = \det Q^*\, D(x_1, \ldots, x_n)$$

But, as we know, $Q^* = P^{-1}$, and so we have the relation

$$D'(x_1, \ldots, x_n) = \pm\,|\det P|^{-1} D(x_1, \ldots, x_n) \tag{2}$$

where on the right we have plus if $\det P > 0$ and minus if
$\det P < 0$.

We see that $D(x_1, \ldots, x_n)$ is an axial pseudoinvariant of weight
$\sigma = -1$, which is defined on all ordered n-tuples of vectors. Note
that $D(x_1, \ldots, x_n) > 0$ if the vectors $x_1, \ldots, x_n$ are linearly
independent and the ordered n-tuple $x_1, \ldots, x_n$ is positively oriented
(that is, the orientation is the same as that of the basis $e_1, \ldots, e_n$).
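
Relation (2) lends itself to a direct numerical check. In the sketch below the vectors are given by their components in the new basis and the components in the old basis are recovered from $x^i = \sum P_{i'}^{i} x^{i'}$, which avoids inverting P. The storage convention (`P[i][j]` for $P_{j'}^{i}$, one vector per row) and the sample numbers are assumptions made for the illustration.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def old_components(x_new, P):
    """Components in the old basis from those in the new one."""
    n = len(P)
    return [sum(P[i][ip] * x_new[ip] for ip in range(n)) for i in range(n)]


P = [[1, 0, 0], [0, 1, 1], [0, 2, -1]]        # det P = -3
X_new = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]     # rows: x_1, x_2, x_3 in the new basis
X_old = [old_components(row, P) for row in X_new]

D_old, D_new = det(X_old), det(X_new)
sign = 1 if det(P) > 0 else -1

# Relation (2): D' = +/- |det P|^(-1) D, with minus here because det P < 0.
print(D_new, sign * Fraction(D_old, abs(det(P))))
```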


2. From the properties of a determinant it follows immediately
that $D(x_1, \ldots, x_n)$ is a multilinear form, that is, a function linear
in each vector argument.

The form $D(x_1, \ldots, x_n)$ is skew, that is, it is skew-symmetric
in any pair of arguments (since the determinant changes sign
under an interchange of two rows).




Expanding the determinant by its definition, we get a component
representation of the form $D(x_1, \ldots, x_n)$ relative to the basis
$e_1, \ldots, e_n$:

$$D(x_1, \ldots, x_n) = \sum \delta_{i_1 \ldots i_n} x_1^{i_1} \ldots x_n^{i_n}$$

Here, $\delta_{i_1 \ldots i_n} = 0$ if there are any identical indices from among
$i_1, \ldots, i_n$; $\delta_{i_1 \ldots i_n} = +1$ if $i_1, \ldots, i_n$ constitute an even
permutation of the positive integers 1, 2, ..., n; and
$\delta_{i_1 \ldots i_n} = -1$ if the permutation $i_1, \ldots, i_n$ is odd.

From this and from the preceding subsection it follows that
$\delta_{i_1 \ldots i_n}$ is an axial covariant pseudotensor of weight $\sigma = -1$. It
is also clear that $\delta_{i_1 \ldots i_n}$ is skew-symmetric with respect to any
two indices.
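
The component representation can be evaluated literally, term by term. The sketch below computes the symbol $\delta_{i_1 \ldots i_n}$ by counting inversions and then forms the sum over all index sets; indices run from 0 to n - 1 instead of 1 to n, and the names `delta` and `D` are chosen here for illustration. The sum has $n^n$ terms, so this is meant for small n only.

```python
from itertools import product


def delta(indices):
    """0 if an index repeats, otherwise +1 or -1 by the parity of the permutation."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:   # each inversion flips the sign
                sign = -sign
    return sign


def D(vectors):
    """Sum of delta[i_1...i_n] * x_1^{i_1} * ... * x_n^{i_n} over all index sets."""
    n = len(vectors)
    total = 0
    for idx in product(range(n), repeat=n):
        d = delta(idx)
        if d == 0:
            continue
        term = d
        for row, i in enumerate(idx):
            term *= vectors[row][i]
        total += term
    return total


print(D([[1, 2], [3, 4]]))   # the determinant 1*4 - 2*3
```

Interchanging two of the vectors changes the sign of the result, in agreement with the skew symmetry noted above.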

Remark. It is possible to show directly that if we subject
$\delta_{i_1 \ldots i_n}$ to a transformation via a purely covariant law of type
(IV), Section 4, we get the numbers

$$\delta_{i'_1 \ldots i'_n} = \pm\,|\det P|^{-1} \sum \delta_{i_1 \ldots i_n} P_{i'_1}^{i_1} \ldots P_{i'_n}^{i_n} \tag{3}$$

which are exactly the same as $\delta_{i_1 \ldots i_n}$. Namely, $\delta_{i'_1 \ldots i'_n} = 0$ if
there are any identical indices; $\delta_{i'_1 \ldots i'_n} = \pm 1$ depending on the
parity of the permutation $i'_1, \ldots, i'_n$. To illustrate, note only
that if $i'_1 = 1$, $i'_2 = 2$, ..., $i'_n = n$, then the sum in the right
member of (3) is equal to $\det P$. Therefore $\delta_{1'2' \ldots n'} = +1$. The
other cases are left to the reader.


3. Using the form $D(x_1, \ldots, x_n)$, we can construct a new
skew-symmetric form of $x_1, \ldots, x_n$, which will then be an axial
invariant, that is to say, it will react to the orientation of the basis in
sign alone while preserving its absolute value in all bases. But to
do this we will have to invoke a certain invariant quadratic form.

Let us take at pleasure some invariant quadratic form $a(x, x)$
with the sole proviso that it be nonsingular. Relative to an
arbitrary basis $e_1, \ldots, e_n$, this form has a definite component
representation and together with it a definite matrix A, where
$\Delta = \det A \neq 0$. When changing to a new basis by (1), the form
$a(x, x)$ receives a new matrix A'. If $\Delta' = \det A'$, then, as we know,

$$\Delta' = \Delta\,(\det P)^2$$

whence

$$\sqrt{|\Delta'|} = \sqrt{|\Delta|}\,|\det P| \tag{4}$$



Multiplying (2) and (4) termwise, we find

$$\sqrt{|\Delta'|}\, D'(x_1, \ldots, x_n) = \pm \sqrt{|\Delta|}\, D(x_1, \ldots, x_n)$$

Thus, the skew-symmetric multilinear form $\sqrt{|\Delta|}\, D(x_1, \ldots, x_n)$ is
an axial invariant.
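
A small two-dimensional experiment illustrates the cancellation of the factors $|\det P|$. The sketch assumes that the matrix of the quadratic form changes as A' = PᵀAP, that `P[i][j]` stores $P_{j'}^{i}$, and that vectors are stored one per row; the function names and sample numbers are illustrative.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]


def oriented_measure(A, X):
    """sqrt(|Delta|) * D, where Delta = det A and D = det X in one and the same basis."""
    return math.sqrt(abs(det2(A))) * det2(X)


A = [[2, 1], [1, 3]]            # matrix of a nonsingular quadratic form (old basis)
P = [[0, 2], [1, 0]]            # det P = -2: the orientation is reversed
X_new = [[1, 1], [0, 1]]        # rows: components of x_1, x_2 in the new basis

# Old components of each vector: x^i = sum over i' of P[i][i'] * x^{i'}.
X_old = [[sum(P[i][ip] * row[ip] for ip in range(2)) for i in range(2)]
         for row in X_new]
A_new = matmul2(matmul2(transpose2(P), A), P)

# Same absolute value in both bases, opposite sign because det P < 0.
print(oriented_measure(A_new, X_new), oriented_measure(A, X_old))
```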

4. From this it follows immediately that relative to an arbitrary
basis $e_1, \ldots, e_n$ the numbers

$$\varepsilon_{i_1 \ldots i_n} = \sqrt{|\Delta|}\, \delta_{i_1 \ldots i_n}$$

are the components of an axial tensor. It is called the discriminant
tensor of the form $a(x, x)$. Considerable use will be made of the
discriminant tensor in Chapters VIII to X.

5. Let $\mathfrak{A}$ be a real n-dimensional affine space corresponding to
an n-dimensional linear space L.

Take an arbitrary point $A \in \mathfrak{A}$ and arbitrary vectors
$x_1, \ldots, x_n \in L$, the number of them being equal to the dimension.
Then let M be a point of the space $\mathfrak{A}$ defined by the equation

$$\overrightarrow{AM} = \tau_1 x_1 + \ldots + \tau_n x_n$$

where $\tau_1, \ldots, \tau_n$ are real numbers. If $\tau_1, \ldots, \tau_n$ vary independently
of one another under the conditions $0 \le \tau_k \le 1$ ($k = 1, \ldots, n$),
then all possible resulting points constitute a spatial figure which
is called a parallelepiped constructed on the vectors $x_1, \ldots, x_n$
applied to the point A. For n = 2, the parallelepiped is called a
parallelogram (see Section 8, Chapter III).

Let the space L be oriented by specification of the basis
$e_1, \ldots, e_n$. Then if the vectors $x_1, \ldots, x_n$ are linearly independent,
the parallelepiped constructed on them is assigned a positive or
negative orientation. The parallelepiped is said to have a positive
orientation if the ordered n-tuple of vectors $x_1, \ldots, x_n$ is
positively oriented.

6. We wish to associate with every parallelepiped a number,
which, by analogy with three-dimensional Euclidean space, it
would be natural to term a volume (in the two-dimensional case,
an area). Taking into account this analogy, we make the following
requirements concerning the desired quantity.

(1) The volume must depend solely on the vectors $x_1, \ldots, x_n$
and not on the point A.

(2) The volume must be a positive number in the case of a
positive orientation of the parallelepiped, and a negative number in
the case of a negative orientation, and must be zero if the vectors
$x_1, \ldots, x_n$ are linearly dependent (in which case the entire
parallelepiped lies in a hyperplane).

(3) The absolute value of the volume must be an invariant.

(4) Increasing the length of one of the vectors $x_1, \ldots, x_n$ $\alpha$-fold
increases the volume $\alpha$ times.

(5) If $x_1 = x_1' + x_1''$, then the volume of the parallelepiped
constructed on the vectors $x_1, x_2, \ldots, x_n$ must be equal to the sum of
the volumes of the parallelepipeds constructed on $x_1', x_2, \ldots, x_n$
and $x_1'', x_2, \ldots, x_n$. A similar property must hold true relative
to the other vectors of the n-tuple $x_1, \ldots, x_n$.

It turns out that the indicated requirements actually determine
the volume as a function of $x_1, \ldots, x_n$. Indeed, they signify that
this function must be a multilinear form of $x_1, \ldots, x_n$ that is
skew-symmetric in every pair of arguments. As a numerical quantity it
must be an axial invariant.

But these very same properties are possessed by the multilinear
form $\sqrt{|\Delta|}\, D(x_1, \ldots, x_n)$ given in Subsection 3. On the other hand,
by what was described in Section 8 of Chapter V, any other
multilinear form having the same properties is proportional to the form
$\sqrt{|\Delta|}\, D(x_1, \ldots, x_n)$. Thus, if the volume of an oriented
parallelepiped constructed on the vectors $x_1, \ldots, x_n$ is denoted by
$V(x_1, \ldots, x_n)$, then

$$V(x_1, \ldots, x_n) = C \sqrt{|\Delta|}\, D(x_1, \ldots, x_n) \tag{5}$$

where C is any invariant constant (different from zero,
naturally).


7. We can change the constant C and the form $a(x, x)$, the
determinant $\Delta$ of which enters into (5). However, the factor $C\sqrt{|\Delta|}$
in the right member of (5) will be uniquely defined if we designate
at pleasure a parallelepiped with unit volume, that is, if we
arbitrarily take the linearly independent vectors $a_1, \ldots, a_n$ and
require that $V(a_1, \ldots, a_n) = 1$. Thus defining $C\sqrt{|\Delta|}$, we get the
formula

$$V(x_1, \ldots, x_n) = \frac{D(x_1, \ldots, x_n)}{D(a_1, \ldots, a_n)}$$

Thus, the measurement of volumes in affine space is uniquely
determined by an arbitrary choice of the unit of volume. A motivated
choice of the unit of volume is naturally done in linear spaces
equipped with a metric (see Chapter VIII in this connection).
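
The ratio formula is easy to test: since both determinants acquire the same factor under a change of basis, the quotient does not depend on the basis used to compute it. In the sketch below a change of basis is modelled by multiplying the row matrices of components on the right by one and the same nonsingular matrix M (the role played by $Q^*$ in $X' = XQ^*$); the names `volume`, `U`, `M` and the numbers are assumptions made for the illustration.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def volume(xs, unit):
    """V(x_1..x_n) = D(x_1..x_n) / D(a_1..a_n); rows of `unit` span the unit parallelepiped."""
    return Fraction(det(xs), det(unit))


X = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]    # rows: x_1, x_2, x_3
U = [[1, 0, 0], [0, 2, 0], [0, 0, 1]]    # rows: a_1, a_2, a_3 (chosen unit of volume)
M = [[1, 0, 0], [0, 1, 1], [0, 2, -1]]   # nonsingular matrix of a change of components

print(volume(X, U), volume(matmul(X, M), matmul(U, M)))   # the two values coincide
```

Requirement (4) can be seen in the same way: multiplying one row of `X` by 3 multiplies the value returned by `volume` by 3.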


8. If when changing bases we do not use the entire group of
nonsingular matrices but confine ourselves to the unimodular
subgroup G, then there will be no need for the quadratic form $a(x, x)$,
since, relative to G, the quantity $D(x_1, \ldots, x_n)$ is itself an
invariant. It is also necessary in that case to confine oneself to bases
of a certain class $\mathscr{E}$ relative to the unimodular subgroup G.
Suppose that the class $\mathscr{E}$ has been chosen. Then putting

$$V(x_1, \ldots, x_n) = C\, D(x_1, \ldots, x_n) \tag{6}$$

we get the volume as an invariant with respect to the unimodular
subgroup. Since $D(e_1, \ldots, e_n) = 1$, it follows that $C = V_0$, where
$V_0 = V(e_1, \ldots, e_n)$ is the volume of the parallelepiped constructed
on the basis vectors $e_1, \ldots, e_n$. Also note that in (6) we can take
C = 1. Then $V(e_1, \ldots, e_n) = 1$, which means that a parallelepiped
constructed on the basis vectors $e_1, \ldots, e_n$ has unit volume. In this
case, all the bases of the chosen class $\mathscr{E}$ have the feature that for
them V = +1.

9. In exactly the same way, the auxiliary quadratic form $a(x, x)$
is not needed if a class of bases $\mathscr{E}(e)$ is considered relative to the
subgroup of matrices with unit modulus of the determinant. In
that case, the volume is also expressed by (6) but is an axial
invariant. As above, we assume that C = 1. Then a parallelepiped
constructed on the vectors of basis e will again have the volume
V = +1, and all bases of the class $\mathscr{E}(e)$ will have the characteristic
that for them |V| = 1, and V = +1 or V = -1 depending on
the orientation of an arbitrary basis of the class $\mathscr{E}(e)$ with
respect to the original basis e.

In order to stress the dependence of the sign of the volume on
the orientation, one often uses the term "oriented volume of a
parallelepiped".


Chapter  VII 


LINEAR  TRANSFORMATIONS 
OF  LINEAR  SPACES 


§  1.  Generalities 

1. Definition. A mapping (map) $y = Ax$, $x \in L$, $y \in L$, of the
linear space L into itself (or onto itself) is said to be linear if

$$A(\alpha x_1 + \beta x_2) = \alpha A(x_1) + \beta A(x_2) \tag{1}$$

for any vectors $x_1, x_2 \in L$ and any scalars $\alpha$, $\beta$. Here and
henceforth numerical factors are real or complex depending on whether
the space L is real or complex.

The mapping $y = Ax$ is also called a linear transformation of
the space L. We sometimes say that A(x) is a linear operator
in L.

Linear transformations represent a multidimensional
generalization of a linear function of a single numerical argument
f(x) = kx. Their diversity grows with increasing dimensionality.

In the notation of linear transformations the parentheses are
ordinarily dropped, and in place of A(x) we write Ax.

The simplest instances of linear transformations are: the identity
transformation

$$Ex = x \tag{2}$$

and the zero transformation

$$\Theta x = 0$$

where $\Theta$ is the symbol of a linear transformation that associates
to every vector x a zero vector.

2. The product of any two linear transformations A and B is
linear:

$$AB(\alpha x_1 + \beta x_2) = A(\alpha Bx_1 + \beta Bx_2) = \alpha ABx_1 + \beta ABx_2$$

3. For linear transformations we define the operations of
addition and multiplication by scalars:

$$(A + B)x = Ax + Bx, \qquad (\alpha A)x = \alpha Ax \tag{3}$$

It may readily be shown that the transformations A + B and $\alpha A$
are also linear and that the set of linear transformations of a
space L is itself a linear space. The role of the zero vector in the
space of linear transformations is played by the zero
transformation $\Theta$. We leave the proof of these assertions to the reader.

4. The existence of three operations (multiplication of linear
transformations, addition, and multiplication by scalars) makes
it possible to construct polynomials of transformations:

$$p(A) = a_0 A^n + a_1 A^{n-1} + \ldots + a_{n-1} A + a_n E \tag{4}$$

where $a_j$ are scalars; the powers of a transformation are defined
by successive multiplication: $A^2 = AA$, $A^3 = AAA$, and so forth.
For any transformation A it is assumed, by definition, that

$$A^0 = E \tag{5}$$

so that the term $a_n E$ in the polynomial (4) plays the part of the
constant term.
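
When a transformation is given by a square matrix, a polynomial of type (4) can be evaluated with the three operations alone. The sketch below uses Horner's scheme, which is a choice made here for brevity and not a method taken from the text; `coeffs` lists $a_0, a_1, \ldots, a_n$ in the order of formula (4), and all function names are illustrative.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def poly(coeffs, A):
    """p(A) = a_0 A^n + a_1 A^(n-1) + ... + a_n E, evaluated by Horner's scheme."""
    n = len(A)
    result = scale(0, identity(n))          # start from the zero matrix
    for c in coeffs:
        # result <- result * A + c * E
        result = add(matmul(result, A), scale(c, identity(n)))
    return result


A = [[0, 1], [-1, 0]]          # a matrix with A^2 = -E
print(poly([1, 0, 1], A))      # p(t) = t^2 + 1 gives the zero matrix
```

A polynomial of degree zero, `poly([5], A)`, returns 5E, which reflects convention (5).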

5. Assuming a space to be finite-dimensional, let us introduce
the basis $e_1, \ldots, e_n$.

Suppose that we know the images of the basis vectors $Ae_k$
relative to the given basis, that is, the coefficients of the expansions:

$$Ae_k = \sum A_k^{\alpha} e_{\alpha} \tag{6}$$

Then we know the matrix of the quantities $A_k^i$. It is not by accident
that the indices (one upper, one lower) are set the way they are.
Below we will show that $A_k^i$ is a tensor of that order. Put

$$A^* = \begin{pmatrix} A_1^1 & A_1^2 & \ldots & A_1^n \\ A_2^1 & A_2^2 & \ldots & A_2^n \\ \ldots & \ldots & \ldots & \ldots \\ A_n^1 & A_n^2 & \ldots & A_n^n \end{pmatrix}$$

The symbol * is employed because in the great majority of
applications of linear transformations it is not this matrix but its
transpose that occurs and so the unadorned symbol A is left for
the transpose.

We now show how, if we know matrix A, we can compute y
for any x:

$$\sum y^i e_i = Ax = A\Bigl(\sum x^k e_k\Bigr)$$

We take advantage of the linearity of the transformation A:

$$y = \sum x^k Ae_k = \sum x^k A_k^{\alpha} e_{\alpha}$$


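Changing the designation of one of the indices, we get

$$y = \sum y^i e_i = \sum \Bigl( \sum A_k^i x^k \Bigr) e_i$$

whence

$$y^i = \sum_{k=1}^{n} A_k^i x^k, \qquad i = 1, \ldots, n \tag{7}$$

This coordinate (component) notation is equivalent to a single
matrix equation:

$$y = Ax \tag{8}$$

where

$$A = \begin{pmatrix} A_1^1 & A_2^1 & \ldots & A_n^1 \\ A_1^2 & A_2^2 & \ldots & A_n^2 \\ \ldots & \ldots & \ldots & \ldots \\ A_1^n & A_2^n & \ldots & A_n^n \end{pmatrix}, \quad x = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix}, \quad y = \begin{pmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{pmatrix}$$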

It  is  easy  to  verify  that  different  matrices  A  and  B  specify 
distinct  linear  transformations,  relative  to  a  given  basis. 

Thus, the linear transformation y = Ax of vectors of the space L
is expressed in the form of a linear transformation of the
variables (7), which transformation is given in matrix form by the
very same equation y = Ax. It is termed the coordinate
(component) representation of the linear transformation A.
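
The passage from the images of the basis vectors to the matrix equation can be followed in a few lines of code. In the sketch below `images[k]` lists the coefficients of the expansion (6) of $Ae_k$, the matrix A is obtained by writing these lists as columns, and `apply` carries out formula (7); indices start from 0 and the function names are illustrative.

```python
def matrix_from_images(images):
    """Build A with A[i][k] = A_k^i, where images[k] holds the components of A e_k."""
    n = len(images)
    return [[images[k][i] for k in range(n)] for i in range(n)]


def apply(A, x):
    """Formula (7): y^i = sum over k of A_k^i x^k."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


# A e_1 = e_1 + 2 e_2,  A e_2 = 3 e_1 + 4 e_2  (illustrative values)
images = [[1, 2], [3, 4]]
A = matrix_from_images(images)

print(A)                   # the images appear as the columns of A
print(apply(A, [1, 1]))    # components of A(e_1 + e_2) = A e_1 + A e_2
```

The list `images` itself, read as a matrix, is the matrix A* of the text, and `A` is its transpose.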

6. Using formulas (3) and (7), it is easy to verify that in the
addition of linear transformations their matrices are added, and in
the multiplication of a linear transformation by a scalar, the
matrix is multiplied by that scalar so that the space of linear
transformations of an n-dimensional linear space L is isomorphic to
the space of $n \times n$ matrices. As was done in Section 2 of
Chapter II, we can show that when two linear transformations are
multiplied together, so are their matrices. To the identity
transformation corresponds the unit matrix E, to the zero transformation
corresponds a matrix consisting of zeros. By the foregoing, equations
(2)-(5) may be regarded with equal right as expressions of
transformations or as expressions of matrices.

7. It is appropriate now to indicate some generalizations of the
notions introduced in this section. Let two linear spaces L
and L' be given. A linear map of space L into L' or a linear operator
from L to L' is a function y = Ax which associates to every vector
$x \in L$ a vector y in L' and satisfies the condition of linearity (1). For
L' = L we get the linear transformation defined in Subsection 1.
For linear operators we define the operations of addition and
multiplication by a scalar in accordance with (3). It can be shown
that the set of all linear mappings of L into L' forms a linear
space.

If each of the spaces L and L' is viewed as a group under the
addition of vectors, then any linear operator $L \to L'$ is a
homomorphism. If L and L' are finite-dimensional and bases have been
chosen in them, then the linear operator $L \to L'$ is given by a
matrix and is expressed as a linear transformation of the vector
components, but, in contrast to Subsection 4, the matrix is in
general rectangular. When the dimensions of L and L' coincide, the
operator A has a square matrix.

Let three linear spaces L, L', L'' be given. We consider two linear
mappings:

(1) $y = Bx$, where $x \in L$, $y \in L'$;

(2) $z = Ay$, where $y \in L'$, $z \in L''$.

The product AB of the operator A by the operator B is defined by
the formula

$$z = ABx = A(Bx)$$

and maps L into L''. The linearity of AB is proved as in
Subsection 1.
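As an illustration, the correspondence between the product of operators and the product of their (in general rectangular) matrices can be checked numerically. The following is only a sketch: the dimensions, the matrices `A` and `B`, and the helper names `mat_vec` and `mat_mul` are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Apply the matrix M (a list of rows) to the column of components v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(A, B):
    """Matrix product AB; the number of columns of A must equal the number of rows of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical data: B maps a 2-dimensional L into a 3-dimensional L',
# and A maps L' into a 2-dimensional L''.
B = [[1, 2], [0, 1], [3, -1]]      # 3 x 2 matrix
A = [[1, 0, 2], [-1, 1, 0]]        # 2 x 3 matrix
x = [Fraction(5), Fraction(-2)]

z_stepwise = mat_vec(A, mat_vec(B, x))   # z = A(Bx)
z_product = mat_vec(mat_mul(A, B), x)    # z = (AB)x

print(z_stepwise, z_product)
```

Both computations give the same components, as the definition $ABx = A(Bx)$ requires.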

§  2.  A  linear  transformation  as  a  tensor 

1. We assume the space L to be n-dimensional. We consider the
linear transformation $y = Ax$ of the space L. It is defined
invariantly, that is, independently of any bases whatsoever.

We are now interested in the tensorial nature of a transformation.
In L we pass to a new basis $e_{1'}, \ldots, e_{n'}$. Then

$$A e_{k'} = \sum_{i'} A^{i'}_{k'} e_{i'}$$

How is $A^{i'}_{k'}$ expressed in terms of $A^i_k$? It is not hard to
figure out that what we have is the tensor law of transformation
corresponding to the arrangement of the indices. This can be
established without any calculations whatsoever. Indeed, the
collection of all vectors x in L coincides with the set of all
possible first-order contravariant tensors. The contraction
$\sum_k A^i_k x^k$, for all $x \in L$, yields a first-order
contravariant tensor $y^i$, to which corresponds one very definite
vector $y = \sum_i y^i e_i$ irrespective of the basis $e_i$.
From this, on the basis of a familiar criterion (Chapter V,
Section 4, Subsections 8, 9), we conclude that $A^i_k$ is a tensor,
and we can straightway write down the transformation law of its
components:

$$A^{i'}_{k'} = \sum_{i,k} A^i_k P^k_{k'} Q^{i'}_i \tag{1}$$



Thus, to every linear transformation there is invariantly
associated a tensor

$$A = \sum_{i,k} A^i_k e_i e^k \tag{2}$$

in $T^1_1 = L \otimes L^*$, where $e^1, \ldots, e^n \in L^*$ is the
basis reciprocal to $e_1, \ldots, e_n$.

The converse is also true: to every second-order mixed tensor we
can associate invariantly a linear transformation, since the
contraction of tensor (2) with the vector
$x = x^1 e_1 + \cdots + x^n e_n$ yields a contravariant vector:

$$y = y^1 e_1 + \cdots + y^n e_n, \qquad y^i = \sum_k A^i_k x^k$$

which is independent of the choice of basis $e_1, \ldots, e_n$.

2. We can consider linear mappings of L into L*, L* into L, L*
into L* and, as before, prove that they too are associated with
second-order tensors.

We now show how to set the indices with the knowledge that
the transformation is a tensor.

For example, let $u = Ax$, where $x \in L$, $u \in L^*$. We pass to the
component notation. The vector x is contravariant and so its
components are indicated by superscripts: $\{x^k\}$. The letter A must
have a lower index k so that it will be possible to effect a
contraction on k. The vector u is covariant and so its components
are indicated by subscripts: $\{u_i\}$.

Therefore the contraction must yield a covariant vector, which
means that the other index on A must also be a subscript:

$$u_i = \sum_k A_{ik} x^k$$

This notation signifies that a linear mapping of the space L into
the conjugate space L* is associated with the doubly covariant
tensor $A_{ik}$.

Similarly, for the transformation $y = Bv$ of the conjugate space
L* into L we have a component representation of the form

$$y^i = \sum_k B^{ik} v_k$$

so that the appropriate tensor is doubly contravariant.

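3. Let A be a linear transformation of L. The matrix A of this
transformation relative to the given basis $e_1, \ldots, e_n$ is
written thus:

$$A = \begin{Vmatrix}
A^1_1 & A^1_2 & \ldots & A^1_n \\
A^2_1 & A^2_2 & \ldots & A^2_n \\
\ldots & \ldots & \ldots & \ldots \\
A^n_1 & A^n_2 & \ldots & A^n_n
\end{Vmatrix}$$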



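Passing to a new basis $e_{1'}, \ldots, e_{n'}$, we get a new matrix A',
whose elements are expressed by (1). Now let us write (1) in matrix
notation. It is best to write out the matrices in full so as not to
make a mistake in the order when multiplying them.

The rows of matrix P are expanded in terms of the upper index:

$$P = \begin{Vmatrix}
P^1_{1'} & P^2_{1'} & \ldots & P^n_{1'} \\
P^1_{2'} & P^2_{2'} & \ldots & P^n_{2'} \\
\ldots & \ldots & \ldots & \ldots \\
P^1_{n'} & P^2_{n'} & \ldots & P^n_{n'}
\end{Vmatrix}$$

The rows of matrix Q are expanded in terms of the lower index:

$$Q = \begin{Vmatrix}
Q^{1'}_1 & Q^{1'}_2 & \ldots & Q^{1'}_n \\
Q^{2'}_1 & Q^{2'}_2 & \ldots & Q^{2'}_n \\
\ldots & \ldots & \ldots & \ldots \\
Q^{n'}_1 & Q^{n'}_2 & \ldots & Q^{n'}_n
\end{Vmatrix}$$

Formula (1) can be rewritten thus:

$$A^{i'}_{k'} = \sum_{i,k} Q^{i'}_i A^i_k P^k_{k'}$$

whence we obtain the desired matrix expression

$$A' = Q A P^*$$

or

$$A' = Q A Q^{-1} \tag{1a}$$

where, as usual, we have

$$Q = (P^*)^{-1}$$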

4. Very important corollaries follow from these matrix formulas.

Since Q is a nonsingular matrix, it follows from (1a) that
rank A' = rank A. Thus, the rank of A is an invariant. Furthermore,
another invariant is the determinant of the linear transformation,
since

$$\det A' = \det Q \, \det A \, (\det Q)^{-1} = \det A$$

Also an invariant is the complete contraction of the tensor $A^i_k$:

$$\sum_i A^i_i = A^1_1 + A^2_2 + \cdots + A^n_n$$

which is the trace of the matrix of the linear transformation.



Note  that  when  we  speak  of  the  “determinant  of  a  matrix”  or 
the  “trace  of  a  matrix”  without  indicating  the  object  associated 
with  the  matrix,  the  question  of  invariance  is  not  clear.  For 
example,  neither  the  determinant  nor  the  trace  of  the  matrix  of  a 
bilinear  form  is  an  invariant. 
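The contrast drawn here can be tried on a small numerical example. The sketch below is illustrative only: the matrices `A` and `Q` are arbitrary choices, the helper names are hypothetical, and the law $Q A Q^T$ is used merely as a stand-in for the way the matrix of a bilinear form changes (the exact form of that law in this book's notation is not reproduced here).

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))

def inverse2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
Q = [[Fraction(2), Fraction(0)], [Fraction(1), Fraction(1)]]   # det Q = 2

similar = mat_mul(mat_mul(Q, A), inverse2(Q))      # A' = Q A Q^{-1}, law (1a)
congruent = mat_mul(mat_mul(Q, A), transpose(Q))   # Q A Q^T, for contrast only

print(det2(A), trace(A))                  # 6 5
print(det2(similar), trace(similar))      # 6 5
print(det2(congruent), trace(congruent))  # 24 14
```

Under law (1a) the determinant and trace are unchanged, whereas under the second law both change, which is why the associated object must be specified before speaking of invariance.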

5.  Let  A  be  a  linear  transformation  of  a  space  and  let  the  same 
symbol  A  denote  the  matrix  of  this  transformation  relative  to  an 
arbitrarily  chosen  basis.  The  foregoing  subsection  enables  us  to 
give  the  following  definitions. 

(1) The rank of matrix A is called the rank of the transformation A.

(2) The determinant of matrix A is called the determinant of the
transformation A.

(3) The trace of matrix A is called the trace of the transformation A.

The geometrical meaning of rank and determinant of a transformation
is considered in the next section.

§  3.  The  geometrical  meaning  of  the  rank  and  determinant 
of  a  linear  transformation.  The  group  of  nonsingular 
linear  transformations 

1. Let a linear transformation A be given in an n-dimensional
vector space L. Suppose that rank A = r.

Denote the image of L by $\mathcal{M}$ or by A(L), that is, the set of
elements y of the form $y = Ax$, where x ranges over the whole of L.

Theorem 1. The set $\mathcal{M} = A(L)$ is a linear subspace of
dimension r in L.

Proof. We have

$$y = Ax = A\Bigl(\sum_k x^k e_k\Bigr) = \sum_k x^k A e_k$$

Hence, $\mathcal{M} = A(L)$ is the linear hull of the vectors
$Ae_1, \ldots, Ae_n$; but, as we know, the linear hull of a given
system of vectors is a subspace whose dimension is equal to the rank
of the system of vectors. The components of the vectors
$Ae_1, \ldots, Ae_n$ form the columns of matrix A, so that the
dimension of A(L) is equal to the rank of A.
The theorem is proved.

2. Denote by $\mathcal{N}$ the total inverse image of the zero vector 0
under the transformation A, that is, the set of all vectors x of
space L for which $Ax = 0$. The set $\mathcal{N}$ is called the null
space of the transformation A.

Theorem 2. If rank A = r, then the null space $\mathcal{N}$ of the
transformation A is a subspace of dimension $n - r$ in the space L.




Proof. $x \in \mathcal{N}$ if and only if $Ax = 0$. Writing this vector
equation in components in terms of an arbitrary basis
$e_1, \ldots, e_n$, we get a system of homogeneous linear equations
whose rank is r:

$$\begin{aligned}
A^1_1 x^1 + \cdots + A^1_n x^n &= 0, \\
\ldots\ldots\ldots\ldots\ldots\ldots & \\
A^n_1 x^1 + \cdots + A^n_n x^n &= 0
\end{aligned} \tag{1}$$

According to Section 5, Chapter III, the set of vectors whose
components satisfy (1) is a subspace of dimension $n - r$, which
is what we wanted to prove.

3.  Theorems  1  and  2  permit  us  to  give  two  geometric  definitions 
of  the  rank  of  a  transformation  that  are  equivalent  to  the  original 
algebraic  definition  (Section  2,  Subsection  5). 

(1) The rank of a linear transformation is equal to the dimension
of the image of the entire space L.

(2)  The  rank  of  a  linear  transformation  is  equal  to  the  difference 
between  the  dimension  of  the  space  and  the  dimension  of  the  null 
space  of  the  transformation  (that  is,  of  the  complete  inverse  image 
of  the  zero  vector). 
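The two definitions can be compared on a concrete matrix. The following sketch, which is not part of the original exposition, finds the rank by Gaussian elimination and builds a basis of the null space; the matrix `A` and the function names `rref` and `null_space_basis` are assumptions made for the example.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M and the list of pivot columns."""
    M = [[Fraction(a) for a in row] for row in M]
    pivots, r = [], 0
    for c in range(len(M[0])):
        # look for a row at or below position r with a nonzero entry in column c
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [a / M[r][c] for a in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def null_space_basis(M):
    """One basis vector of the solution space of Mx = 0 for each free column."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, pc in zip(R, pivots):
            v[pc] = -row[f]
        basis.append(v)
    return basis

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]   # hypothetical singular matrix, n = 3
rank = len(rref(A)[1])
basis = null_space_basis(A)
print(rank, len(basis))                 # dimension of image, dimension of null space
```

For this matrix the rank and the dimension of the null space add up to n = 3, in agreement with Theorems 1 and 2.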

4. Let r < n. Consider the action of the transformation A from
the geometric point of view.

Here it will be convenient not to distinguish between a linear
space L and the corresponding affine space $\mathfrak{A}$ and to
identify every point of $\mathfrak{A}$ with its radius vector.

Let us consider the nonhomogeneous system of equations

$$\sum_j A^i_j x^j = y^i, \qquad i = 1, \ldots, n \tag{2}$$

where $A = \|A^i_j\|$ is the matrix of the linear transformation under
consideration. This system is solvable if and only if the vector
$y = y^1 e_1 + \cdots + y^n e_n$ belongs to the subspace
$\mathcal{M} = A(L)$. For every $y \in \mathcal{M}$ the solution set of
system (2) forms a plane of dimension $n - r$ parallel to the
subspace $\mathcal{N}$ (in this connection, see Sections 6, 7 of
Chapter III). It is evident that every point of the space belongs
to one such plane.

Thus, the entire space splits into parallel planes of dimension
$n - r$, each of which is mapped into a single point of the
subspace $\mathcal{M}$.

5. Definition. When rank A = n, the transformation A is said to be
nonsingular.




We can give other equivalent conditions for nonsingularity:

(1) $\det A \neq 0$,

(2) $\mathcal{M} = A(L) = L$,

(3) $\mathcal{N} = 0$.

Every element of the space L then has a unique inverse image.
This can be verified directly by solving (2) by Cramer's rule.
Denoting by $(A^{-1})^i_k$ the elements of the inverse matrix $A^{-1}$,
we get

$$x^i = \sum_k (A^{-1})^i_k y^k$$

or, in symbolic form,

$$x = A^{-1} y \tag{3}$$

The transformation (3) is a linear transformation inverse to the
given one.

6. Theorem 3. The set of all nonsingular linear transformations
forms a group of transformations of the space L.

Proof. From the theorem on the rank of a product of matrices
(Chapter II, Section 4) it follows that the transformation AB is
nonsingular if A and B are nonsingular. Furthermore,
$\det A^{-1} = (\det A)^{-1} \neq 0$ and so the inverse transformation
$A^{-1}$ is nonsingular. Thus, the set of nonsingular linear
transformations of L satisfies the definition of a group of
transformations (see Section 2, Chapter VI).

7. Theorem 4. In an n-dimensional linear space, a nonsingular
linear transformation A is determined uniquely if we are given
an arbitrary system of n independent vectors $x_1, \ldots, x_n$ as
inverse images and an arbitrary system of n independent vectors
$y_1, \ldots, y_n$ as images.

Proof. We take the vectors $x_1, \ldots, x_n$ for a basis in L and
expand the vectors $y_1, \ldots, y_n$ in terms of this basis:

$$y_k = a^1_k x_1 + \cdots + a^n_k x_n \tag{4}$$

Matrix A of the desired transformation is uniquely defined relative
to the basis $x_1, \ldots, x_n$ by specification of the vectors (4),
since its columns are formed by the components of these vectors
($A^i_k = a^i_k$, see Subsection 5 of Section 1 above);
$\det A \neq 0$ since the vectors (4) are independent. The proof of
Theorem 4 is complete.

8.  Let  us  determine  the  geometric  meaning  of  the  determinant  of 
a  linear  transformation  A. 

To do so, we make use of the notion of the volume of a
parallelepiped (see Section 5, Chapter VI). We define the class of bases




with respect to the subgroup of matrices with determinant modulus
unity and take one of the bases $e_1, \ldots, e_n$ of that class.
Denote by $V_0$ the oriented volume of a parallelepiped constructed on
the vectors $e_1, \ldots, e_n$ and compute the oriented volume V of the
parallelepiped constructed on the vectors $Ae_1, \ldots, Ae_n$. Taking
into account that the components of the vectors $Ae_i$ form the
columns of matrix A, we get, by (6) of Section 5, Chapter VI,

$$V = V_0 \det A \tag{5}$$

Hence, under the given linear transformation, all volumes are
multiplied by one and the same factor, and the determinant of the
transformation is this factor. In the case of a nonsingular
transformation, we get $V \neq 0$, and the bases $e_1, \ldots, e_n$
and $Ae_1, \ldots, Ae_n$ have the same orientation if $\det A > 0$ and
the opposite orientation if $\det A < 0$. If a transformation is
singular, then $\det A = 0$, the vectors $Ae_1, \ldots, Ae_n$ are
linearly dependent, and $V = 0$. Also observe that (5) can be derived
directly from the theorem on the determinant of a product of
matrices.
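In the plane the volume becomes an oriented area, and the role of the determinant as the common factor can be checked directly. This is a minimal sketch under stated assumptions: components are taken relative to a basis with respect to which the oriented area of the parallelogram on u, v is $u^1 v^2 - u^2 v^1$, and the matrices and vectors are arbitrary illustrative choices.

```python
def apply(M, v):
    """Image of the vector v under the 2 x 2 matrix M."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def oriented_area(u, v):
    """Oriented area of the parallelogram constructed on u and v."""
    return u[0] * v[1] - u[1] * v[0]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 1], [1, 3]]      # det A = 5, orientation preserved
B = [[0, 1], [1, 0]]      # det B = -1, orientation reversed
u, v = [1, 2], [3, -1]

V0 = oriented_area(u, v)
VA = oriented_area(apply(A, u), apply(A, v))
VB = oriented_area(apply(B, u), apply(B, v))

print(V0, VA, VB)         # -7 -35 7
```

The ratios VA / V0 and VB / V0 equal det A and det B respectively; the sign change for B reflects the reversal of orientation.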

§  4.  Invariant  subspaces 

1. Definition. A subspace $L' \subset L$ is said to be an invariant
subspace of the transformation $y = Ax$ if $Ax \in L'$ for every
$x \in L'$. (In symbols we can write $A(L') \subset L'$.)

Examples of invariant subspaces are the subspaces $\mathcal{M}$ and
$\mathcal{N}$ introduced in Section 3. We now prove this.

(1) $Ax \in \mathcal{M}$ for any vector $x \in L$, in particular for any
$x \in \mathcal{M}$, and so $A(\mathcal{M}) \subset \mathcal{M}$.

(2) If $x \in \mathcal{N}$, then $Ax = 0 \in \mathcal{N}$, so that
$A(\mathcal{N}) \subset \mathcal{N}$.

The zero subspace (which consists of the single vector 0) is
invariant under any linear transformation A since $A0 = 0$.

2. Let L' be a subspace invariant with respect to A. Then the
transformation A does not carry vectors belonging to L' outside L'.
Thus the linear transformation

$$y = Ax, \qquad x \in L', \quad y \in L' \tag{1}$$

is defined in L'. We will say that the transformation A specified in
L induces the transformation (1) in the invariant subspace L'. At
times it is convenient to denote the induced transformation by a
different symbol than A, say A'. Then $A'x = Ax$ if $x \in L'$, and
$A'x$ is not defined if x does not belong to L'.

If the transformation A is nonsingular, then the induced
transformation A' is nonsingular as well, and for that reason
$A(L') = A'(L') = L'$. This is clear since otherwise there would be a
vector $x \in L' \subset L$, $x \neq 0$, for which $Ax = 0$.



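3. If the subspace L' is not invariant with respect to A, then
there is a vector $x \in L'$ for which Ax does not belong to L'. For
this reason, A does not induce any transformation in the
subspace L'.

4. We now show that if the invariant subspace L' is known,
then it is possible to simplify the transformation matrix by placing
several basis vectors in L'.

Let the vectors $e_1, \ldots, e_k \in L'$ (we take them to form a basis
of L'). Then their images belong to L' and can be expanded in terms
of the same vectors:

$$\begin{aligned}
Ae_1 &= A^1_1 e_1 + \cdots + A^k_1 e_k \\
&\ldots\ldots\ldots \\
Ae_k &= A^1_k e_1 + \cdots + A^k_k e_k
\end{aligned}$$

In general, these are followed by longer expansions:

$$\begin{aligned}
Ae_{k+1} &= A^1_{k+1} e_1 + \cdots + A^k_{k+1} e_k + A^{k+1}_{k+1} e_{k+1} + \cdots + A^n_{k+1} e_n \\
&\ldots\ldots\ldots \\
Ae_n &= A^1_n e_1 + \cdots + A^k_n e_k + A^{k+1}_n e_{k+1} + \cdots + A^n_n e_n
\end{aligned}$$

Thus, in the case at hand, the transformation matrix (which is
the transpose of the matrix of coefficients of the expansions that
have been written out) is given as follows:

$$A = \begin{Vmatrix}
A^1_1 & \ldots & A^1_k & A^1_{k+1} & \ldots & A^1_n \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
A^k_1 & \ldots & A^k_k & A^k_{k+1} & \ldots & A^k_n \\
0 & \ldots & 0 & A^{k+1}_{k+1} & \ldots & A^{k+1}_n \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & 0 & A^n_{k+1} & \ldots & A^n_n
\end{Vmatrix}$$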

5. Let us consider an important special case where the space L
is the direct sum of two nonzero invariant subspaces L' and L'':

$$L = L' \oplus L'', \qquad A(L') \subset L', \quad A(L'') \subset L''$$

Choose bases in L' and L'':

$$e_1, \ldots, e_k \in L', \qquad e_{k+1}, \ldots, e_n \in L''$$

Then by Theorem 4, Section 14, Chapter I, the vectors
$e_1, \ldots, e_n$ form a basis of the space L. Relative to such a
basis, the computations of the preceding subsection are applicable
to $e_1, \ldots, e_k$ and to $e_{k+1}, \ldots, e_n$ as well. Therefore
the matrix A decomposes into




two autonomous "boxes"

$$A = \begin{Vmatrix}
k \times k & 0 \\
0 & (n - k) \times (n - k)
\end{Vmatrix}$$

These "boxes" are matrices of linear transformations induced on L'
and L''.

Thus, the study of the transformation of the space as a whole
reduces to the study of its action in L' and L''.

6. In Section 10, below, we make use of the following lemma.

Lemma. If $L = L' \oplus L''$ and the subspaces L', L'' are invariant
under A, then $A(L) = A(L') \oplus A(L'')$.

Proof. If L is the sum of L' and L'' (though not necessarily the
direct sum), then, as is readily verifiable, $A(L) = A(L') + A(L'')$.
On the other hand, because of the invariance of L' and L'' we
have

$$A(L') \subset L', \qquad A(L'') \subset L''$$

whence

$$A(L') \cap A(L'') \subset L' \cap L''$$

but the sum of L' and L'' is a direct sum, and so $L' \cap L'' = 0$.
Hence, $A(L') \cap A(L'') = 0$ and, consequently,
$A(L') + A(L'') = A(L') \oplus A(L'')$, which completes the proof.

§  5.  Examples  of  linear  transformations 

Preliminary remark. When examining examples of transformations,
it is convenient not to distinguish between the linear space L
and the corresponding affine space $\mathfrak{A}$ (as is done in
Subsection 4 of Section 3).

1. Similarity transformation. The space L has any dimension.
The transformation is given by

$$Ax = \lambda x$$

for any x and a fixed $\lambda$ called the coefficient of similarity
(expansion factor or contraction factor). All vectors are "stretched"
the same number of times (for $|\lambda| < 1$ they contract). In this
case every subspace is invariant. The matrix of a similarity
transformation of an n-dimensional space, given an arbitrary choice
of basis, has the form

$$A = \lambda E = \begin{Vmatrix}
\lambda & & 0 \\
& \ddots & \\
0 & & \lambda
\end{Vmatrix}; \qquad \det A = \lambda^n$$

The identity transformation E may be regarded as a similarity
transformation with coefficient unity, and the zero transformation 0,
as a similarity transformation with coefficient zero. In the
space of all linear transformations given on L, similarity
transformations $A = \lambda E$ form a one-dimensional subspace (a
straight line passing through the points 0 and E).

2. n = 3. Let $x = \{x^1, x^2, x^3\}$ be an arbitrary vector and
$y = \{y^1, y^2, y^3\}$ its image. We give the transformation $y = Ax$ by

$$\begin{cases}
y^1 = x^1 + x^2 + x^3, \\
y^2 = x^1 + x^2 + x^3, \\
y^3 = 2x^1 + x^2 - x^3
\end{cases} \tag{1}$$

It is clear that the transformation is singular and rank A = 2. The
image of the entire space is the subspace $\mathcal{M} = A(L)$ given by
the equation

$$y^2 = y^1$$

Let us find the total inverse image of an arbitrary point $y^1 = a$,
$y^2 = a$, $y^3 = b$ of the plane $\mathcal{M} = A(L)$ (Fig. 29).
System (1) is consistent for the indicated values of $y^1, y^2, y^3$.
The first equation may be dropped, leaving two equations defining a
straight line:

$$\begin{aligned}
x^1 + x^2 + x^3 &= a, \\
2x^1 + x^2 - x^3 &= b
\end{aligned}$$

which we denote by $\mathcal{P}$.

The straight line thus found is the desired inverse image. For
different a and b, all such straight lines are parallel and cover the
entire space. In Fig. 29, $\mathcal{N}$ denotes the straight line which
is the complete inverse image of 0. Points not in the plane A(L) do
not have inverse images.
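The statements of this example are easy to confirm by direct computation. In the sketch below the particular solution with $x^3 = 0$ and the direction vector (2, -3, 1) were worked out by hand for this illustration and are not taken from the text; the function names are likewise hypothetical.

```python
A = [[1, 1, 1],
     [1, 1, 1],
     [2, 1, -1]]           # matrix of transformation (1)

def apply(M, x):
    """Image of the vector x under the matrix M."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

def inverse_image_point(a, b, t):
    """Point with parameter t on the line x1+x2+x3 = a, 2*x1+x2-x3 = b."""
    x0 = [b - a, 2 * a - b, 0]   # particular solution with x3 = 0
    d = [2, -3, 1]               # direction: a solution of the homogeneous system
    return [p + t * q for p, q in zip(x0, d)]

# Every point of the line goes into the single point (a, a, b) of the plane y2 = y1.
images = [apply(A, inverse_image_point(4, 7, t)) for t in range(-3, 4)]
print(images[0])                 # [4, 4, 7]

# The image of an arbitrary vector always satisfies y2 = y1.
y = apply(A, [5, -2, 9])
print(y[0] == y[1])              # True
```

Since the direction vector does not depend on a and b, the lines obtained for different a and b are indeed parallel.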

3. n = 2. We give the transformation $y = Ax$ by

$$\begin{cases}
y^1 = k x^1, \\
y^2 = x^2
\end{cases}$$



The transformation matrix is
$A = \begin{Vmatrix} k & 0 \\ 0 & 1 \end{Vmatrix}$. Let us take an
arbitrary point M and its image M'. The line segment MM' is parallel
to the $x^1$-axis. Extend this segment to intersection with the
$x^2$-axis at point K (Fig. 30). Then for any choice of M we have

$$\frac{KM'}{KM} = k$$

If $|k| < 1$, then KM contracts. If $k < 0$, then the points M
and M' lie in different half-planes $x^1 > 0$ and $x^1 < 0$ (Fig. 31).

This kind of transformation is called a compression along the
$x^1$-axis to the $x^2$-axis. When $|k| > 1$ we speak of a stretching.


4. The transformation

$$\begin{cases}
y^1 = x^1, \\
y^2 = k x^2
\end{cases}$$

with matrix $A = \begin{Vmatrix} 1 & 0 \\ 0 & k \end{Vmatrix}$ is a
compression along the $x^2$-axis to the $x^1$-axis (for $|k| > 1$ we
actually have a stretching, see Fig. 32).


5. The transformation

$$\begin{cases}
y^1 = k_1 x^1, \\
y^2 = k_2 x^2
\end{cases} \tag{2}$$



$= A_1 A_2 = A_2 A_1$. Therefore, a transformation with matrix A may
be obtained by compounding a compression along the $x^1$-axis to
the $x^2$-axis and a compression along the $x^2$-axis to the
$x^1$-axis in any order. Transformations of this type are frequently
encountered in the theory of elasticity.


It is easy to verify that in each of the examples of Subsections
3-5, the axes $x^1$ and $x^2$ are invariant subspaces.

6. Let $L = L' \oplus L''$. The dimension of L is arbitrary and the
subspaces L' and L'' are not zero. Let L' and L'' be invariant
subspaces of the transformation A which induces in L' an identity
transformation and in L'' a similarity transformation with
coefficient $\lambda$. This kind of transformation is called a
compression with coefficient $\lambda$ to the subspace L' in the
direction of the subspace L''.

Representing x in the form $x = x' + x''$ ($x' \in L'$, $x'' \in L''$)
we get

$$Ax = x' + \lambda x''$$

Let the vectors $e_1, \ldots, e_k$ form a basis in L', and the vectors
$e_{k+1}, \ldots, e_n$ a basis in L''. Then, relative to the basis
$e_1, \ldots, e_n$, the transformation matrix A is of the form

$$A = \begin{Vmatrix}
1 & & & & & \\
& \ddots & & & 0 & \\
& & 1 & & & \\
& & & \lambda & & \\
& 0 & & & \ddots & \\
& & & & & \lambda
\end{Vmatrix}$$



Note that the linear hull $L(e_{i_1}, \ldots, e_{i_m})$ of any
subsystem of the basis $e_1, \ldots, e_n$ is an invariant subspace.

7. Putting $\lambda = 0$ in the preceding example, we get a
transformation that is called the projection of space L onto the
subspace L' in the direction of the subspace L''. The projection may
be determined directly as follows. If x is any vector of L, then it
can be uniquely expressed as $x = x' + x''$, where $x' \in L'$,
$x'' \in L''$. Then the projection of x on L' in the direction of L''
is $Ax = x'$. The projection is a degenerate transformation; L' is the
image of the whole of L, L'' is the complete inverse image of the
zero vector.


8. As an example, we now give a construction that will be
important for what follows. Fix a basis $e_1, \ldots, e_n$ in the
space L.

Relative to this basis, the transformation, which we denote by
$G_n(\lambda)$ or, more compactly, G, will be given by the formulas

$$\begin{aligned}
Ge_1 &= \lambda e_1, \\
Ge_2 &= e_1 + \lambda e_2, \\
Ge_3 &= e_2 + \lambda e_3, \\
&\ldots\ldots\ldots \\
Ge_n &= e_{n-1} + \lambda e_n
\end{aligned} \tag{3}$$


The matrix of this transformation with respect to the basis
$e_1, \ldots, e_n$ is called an n-dimensional Jordan submatrix
corresponding to the number $\lambda$:

$$G_n(\lambda) = \begin{Vmatrix}
\lambda & 1 & & & 0 \\
& \lambda & 1 & & \\
& & \ddots & \ddots & \\
& & & \lambda & 1 \\
0 & & & & \lambda
\end{Vmatrix}$$

The Jordan submatrix has $\lambda$ on the main diagonal, ones on the
diagonal above the main diagonal, and zeros elsewhere.

Subspaces of the form $L(e_1, \ldots, e_k)$, $k < n$, are invariant; in
each one of them the transformation is given by the Jordan
submatrix $G_k(\lambda)$ of dimension k.

It is obvious that the transformation $G_n(\lambda)$ is degenerate if
and only if $\lambda = 0$. In that case,
$\mathcal{M} = G(L) = L(e_1, \ldots, e_{n-1})$, $\mathcal{N} = L(e_1)$.




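Let us consider in more detail the three-dimensional case when
$\lambda = 0$. The transformation $A = G_3(0)$ is given by the formula

$$\begin{Vmatrix} y^1 \\ y^2 \\ y^3 \end{Vmatrix} =
\begin{Vmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{Vmatrix}
\begin{Vmatrix} x^1 \\ x^2 \\ x^3 \end{Vmatrix}$$

that is,

$$y^1 = x^2, \qquad y^2 = x^3, \qquad y^3 = 0$$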


Every straight line parallel to the $x^1$-axis and intersecting the
plane $x^1 = 0$ at the point (0, a, b) is carried into a point with
coordinates (a, b, 0) located in the plane $x^3 = 0$. If the axes
$x^1, x^2, x^3$ are mutually perpendicular, then we can assume that at
first all the space is projected on the plane $(x^2, x^3)$, and then
this plane is superimposed on the plane $(x^1, x^2)$ so that the
positive $x^2$-axis merges with the positive $x^1$-axis, and the
positive $x^3$-axis with the positive $x^2$-axis (Fig. 34).
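The action of $G_n(\lambda)$ described above can be reproduced in a few lines. The sketch assumes only formulas (3); the function names `jordan_block` and `apply` and the sample numbers are chosen for this illustration.

```python
def jordan_block(n, lam):
    """Matrix G_n(lam): lam on the main diagonal, ones just above it, zeros elsewhere."""
    return [[lam if j == i else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def apply(M, x):
    """Image of the vector x under the matrix M."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

G3 = jordan_block(3, 0)

# Points (t, 5, 7) of one straight line parallel to the x^1-axis
# are all carried into the single point (5, 7, 0).
images = [apply(G3, [t, 5, 7]) for t in (-2, 0, 3)]
print(images)

# Formulas (3) for n = 4, lambda = 2: G e_3 = e_2 + 2 e_3.
G4 = jordan_block(4, 2)
print(apply(G4, [0, 0, 1, 0]))   # [0, 1, 2, 0]
```

The k-th column of the matrix holds the components of $Ge_k$, which is why the ones appear above the main diagonal.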

§ 6. Eigenvectors and the characteristic polynomial of a
transformation

1. Definition. An eigenvector of a given linear transformation A
is any nonzero vector x that satisfies the condition

$$Ax = \lambda x \tag{1}$$

where $\lambda$ is a scalar.

The number $\lambda$ is called the eigenvalue of the transformation A
that corresponds to the given eigenvector x.



For brevity, one says "$\lambda$ is the eigenvalue of the given
eigenvector". An eigenvector goes into a vector collinear with it. In
real space, the eigenvalue shows how many times the eigenvector
is "stretched" (or, when $|\lambda| < 1$, "compressed").

It is easy to see that if x is an eigenvector, then $\alpha x$ is also
an eigenvector for any $\alpha \neq 0$ and that the linear hull of every
eigenvector constitutes an invariant one-dimensional subspace (an
invariant straight line).

2. In many problems of algebra and its applications, one is
called upon to find all the eigenvectors of a given linear
transformation. We now investigate this problem.

We consider a linear transformation $y = Ax$ and also the identity
transformation E. We have $Ex = x$ for all $x \in L$. Therefore,
condition (1), under which x is an eigenvector of the given
transformation, can be written as

$$(A - \lambda E)x = 0 \tag{2}$$

Let the transformation $y = Ax$ be represented relative to a basis
$e_1, \ldots, e_n$ by the formulas

$$y^k = \sum_i A^k_i x^i, \qquad k = 1, \ldots, n \tag{3}$$

Since the unit matrix $E = \|\delta^k_i\|$, it follows that because of
(3) the relation (2) is equivalent to the following system of
homogeneous equations:

$$\sum_i (A^k_i - \lambda \delta^k_i) x^i = 0, \qquad k = 1, \ldots, n \tag{4}$$

where $x^1, \ldots, x^n$ are components of the eigenvector x relative to
the basis $e_1, \ldots, e_n$ and $\lambda$ is the eigenvalue of x.

Definition. The matrix $A - \lambda E$ of system (4) is called the
characteristic matrix of the given transformation A, its determinant

$$p(\lambda) = \det(A - \lambda E) = \begin{vmatrix}
A^1_1 - \lambda & A^1_2 & \ldots & A^1_n \\
A^2_1 & A^2_2 - \lambda & \ldots & A^2_n \\
\ldots & \ldots & \ldots & \ldots \\
A^n_1 & A^n_2 & \ldots & A^n_n - \lambda
\end{vmatrix}$$

is called the characteristic determinant of the transformation A.

Obviously, $p(\lambda)$ is a polynomial of degree n in $\lambda$. It is
called the characteristic polynomial of matrix A (or transformation A).

The general plan for solving problems involving eigenvectors
now reduces to the following. First of all, the so-called
characteristic equation

$$p(\lambda) = 0 \tag{5}$$



is formed. Equation (5) is necessary and sufficient for system (4)
to have nontrivial solutions. Therefore, in complex space, the roots
of (5), and only these roots, are the eigenvalues of the
transformation A. In real space, the eigenvalues are all the real
roots, and only these roots. Suppose that all the roots
$\lambda_1, \ldots, \lambda_n$ have been found. For the sake of
definiteness, we assume that we are dealing with real space. Then we
reject all complex roots and run through the remaining ones,
substituting each in turn into system (4). This system will each
time receive definite numerical coefficients.

The rank of the resulting system will be a number r, r < n, so
that the system will have $n - r$ independent solutions. In finding
them, we thus find the $n - r$ independent eigenvectors with one
and the same eigenvalue, which is equal to the root taken. Their
linear hull, with the zero vector eliminated from it, yields all the
eigenvectors with the same eigenvalue. This follows from the
theorem on the solution set of a homogeneous linear system of
equations.

Having thus gone through all the real roots of the characteristic
equation, we are able to find all the eigenvectors of the given
transformation.

In the case of complex space, we have to go through all the roots
$\lambda_1, \ldots, \lambda_n$.
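For n = 2 the plan just described can be followed through explicitly. The sketch below is an illustration under assumptions, not a general algorithm: it treats a real 2 x 2 matrix only, it uses floating-point square roots (so eigenvectors are exact only when the roots are), and the function name `real_eigenpairs_2x2` is hypothetical.

```python
import math

def real_eigenpairs_2x2(A):
    """Real eigenvalues of a 2 x 2 matrix, each with independent eigenvectors.

    The characteristic equation is lambda^2 - (tr A) lambda + det A = 0.
    """
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        return []          # no real roots, hence no eigenvectors in real space
    s = math.sqrt(disc)
    roots = sorted({(tr - s) / 2, (tr + s) / 2})
    pairs = []
    for lam in roots:
        # rows of the characteristic matrix A - lam*E, i.e. of system (4)
        rows = [(a - lam, b), (c, d - lam)]
        nonzero = [r for r in rows if r != (0, 0)]
        if not nonzero:
            # rank 0: every nonzero vector is an eigenvector
            vecs = [(1.0, 0.0), (0.0, 1.0)]
        else:
            # rank 1: the row (p, q) is annihilated by the vector (q, -p)
            p, q = nonzero[0]
            vecs = [(q, -p)]
        pairs.append((lam, vecs))
    return pairs

print(real_eigenpairs_2x2([[2, 1], [1, 2]]))   # eigenvalues 1 and 3
print(real_eigenpairs_2x2([[3, 0], [0, 3]]))   # similarity: two independent eigenvectors
```

For the first matrix the rank of system (4) is 1 at each root, so each eigenvalue has n - r = 1 independent eigenvector; for the similarity transformation the rank is 0 and n - r = 2.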

3. Examples. (1) A similarity transformation is a transformation
for which all nonzero vectors are eigenvectors with the same
eigenvalue equal to the coefficient of similarity.

(2) The transformation $G_n(\lambda_0)$ (see Section 5, Subsection 8)
has only one linearly independent eigenvector. Indeed, for
$G_n(\lambda_0)$ the characteristic polynomial
$p(\lambda) = (\lambda_0 - \lambda)^n$ has the sole root
$\lambda = \lambda_0$ of multiplicity n. For $\lambda = \lambda_0$ the
characteristic matrix $G_n(\lambda_0) - \lambda_0 E$ is of rank
$n - 1$ (the nonzero minor of order $n - 1$ is obtained by crossing
out the left column and lowest row). For this reason, a system of
type (4) made up for the transformation $G_n(\lambda_0)$ has, for
$\lambda = \lambda_0$, only one linearly independent solution.
From formulas (3), Section 5, it is evident that the vector $e_1$ is
an eigenvector.

(3) The transformation of a two-dimensional real plane

$$\begin{cases}
y^1 = x^1 + 2x^2, \\
y^2 = -x^1 + x^2
\end{cases}$$

does not have eigenvectors since $p(\lambda) = \lambda^2 - 2\lambda + 3$
does not have any real roots.

We leave it to the reader to find the eigenvalues and the
eigenvectors in the other examples of Section 5.
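The argument of Example (2) can be traced on a concrete Jordan submatrix. This is a sketch for the hypothetical values n = 4 and $\lambda_0 = 5$; it checks the minor named in the text and the eigenvector $e_1$, and it also evaluates the discriminant of the polynomial of Example (3).

```python
def jordan_block(n, lam):
    """Matrix G_n(lam): lam on the main diagonal, ones just above it, zeros elsewhere."""
    return [[lam if j == i else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def apply(M, x):
    return [sum(m * c for m, c in zip(row, x)) for row in M]

n, lam0 = 4, 5
G = jordan_block(n, lam0)

# Characteristic matrix G - lam0*E at the root lambda = lam0.
C = [[G[i][j] - (lam0 if i == j else 0) for j in range(n)] for i in range(n)]

# Cross out the left column and the lowest row: the remaining minor is the
# unit matrix of order n - 1, so its determinant is 1 and the rank of C is n - 1.
minor = [row[1:] for row in C[:-1]]
unit = [[1 if i == j else 0 for j in range(n - 1)] for i in range(n - 1)]
print(minor == unit)                 # True

e1 = [1, 0, 0, 0]
e2 = [0, 1, 0, 0]
print(apply(G, e1))                  # [5, 0, 0, 0], that is, lam0 * e1
print(apply(C, e2))                  # nonzero, so e2 is not an eigenvector

# Example (3): p(lambda) = lambda^2 - 2*lambda + 3 has negative discriminant.
print((-2) ** 2 - 4 * 1 * 3)         # -8
```

Since the last row of C consists of zeros, det C = 0, and together with the nonzero minor this gives rank n - 1 exactly.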



§ 7. Basic theorems on the characteristic polynomial and
eigenvectors

1. Theorem 1. The rank of a characteristic matrix is, for fixed
$\lambda$, an invariant with respect to a change of basis.

Proof. The theorem is a consequence of the general proposition
of the invariance of the rank of the matrix of a linear
transformation since the characteristic matrix is the matrix of the
transformation $A - \lambda E$.


2. Theorem 2. The characteristic polynomial $p(\lambda)$ is invariant with respect to the transformation of the basis.

Proof. Let $A$ and $A'$ be matrices of the given transformation relative to the bases $e_1, \ldots, e_n$ and $e'_1, \ldots, e'_n$, $P$ the matrix for changing from the first basis to the second, and $Q = (P^*)^{-1}$. By Section 2 we have

$$A' - \lambda E = A' - \lambda E' = Q(A - \lambda E)Q^{-1} \tag{1}$$

($E' = E$ since the identity transformation relative to any basis has a unit matrix). From (1) we get

$$\det(A' - \lambda E) = \det Q \, \det(A - \lambda E) \, \det Q^{-1} = \det(A - \lambda E)$$

Remark. Let us write the characteristic polynomial as

$$p(\lambda) = (-1)^n \left[\lambda^n - p_1 \lambda^{n-1} + p_2 \lambda^{n-2} - \ldots + (-1)^n p_n\right]$$

It is readily verified that $p_1$ is the trace of the matrix $A$ and $p_n = \det A$. From Theorem 2 follows the invariance of all coefficients of $p(\lambda)$, in particular $p_1$ and $p_n$. We have thus obtained another proof of the invariance of the determinant and of the trace of the transformation matrix.
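As a numerical illustration of Theorem 2 and the remark, the sketch below conjugates a sample $3 \times 3$ integer matrix by an invertible matrix and compares traces, determinants and values of the characteristic polynomial. The matrices `A`, `Q`, `Q_INV` and all helper names are assumptions chosen for the illustration; any invertible `Q` would serve.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def shifted(m, t):
    """The characteristic matrix m - t*E."""
    n = len(m)
    return [[m[i][j] - (t if i == j else 0) for j in range(n)] for i in range(n)]


A = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]
Q = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]          # hypothetical change-of-basis factor
Q_INV = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]    # its inverse, written out by hand
A_NEW = matmul(matmul(Q, A), Q_INV)            # A' = Q A Q^{-1}

# Two cubic polynomials that agree at four points are identical.
same_poly = all(det3(shifted(A, t)) == det3(shifted(A_NEW, t)) for t in range(4))
```

Comparing the values at four points is enough because both characteristic polynomials have degree three; the equality of trace and determinant is then the special case of the coefficients $p_1$ and $p_n$.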

3. Theorem 3. If $L = L_1 \oplus L_2$, where $L_1$ and $L_2$ are nonzero subspaces invariant under $A$, then $p(\lambda) = p_1(\lambda) p_2(\lambda)$, where $p_1(\lambda)$, $p_2(\lambda)$ are characteristic polynomials of the transformations induced in $L_1$ and $L_2$.

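Proof. Let $e_1, \ldots, e_k$ be a basis in $L_1$, $e_{k+1}, \ldots, e_n$ a basis in $L_2$. By Section 4, the matrix $A - \lambda E$ relative to the basis $e_1, \ldots, e_n$ of the space $L$ has the form

$$A - \lambda E = \begin{pmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1k} & & & \\
a_{21} & a_{22} - \lambda & \cdots & a_{2k} & & 0 & \\
\vdots & \vdots & & \vdots & & & \\
a_{k1} & a_{k2} & \cdots & a_{kk} - \lambda & & & \\
 & & & & a_{k+1,k+1} - \lambda & \cdots & a_{k+1,n} \\
 & 0 & & & \vdots & & \vdots \\
 & & & & a_{n,k+1} & \cdots & a_{nn} - \lambda
\end{pmatrix}$$

whence

$$\det(A - \lambda E) =
\begin{vmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1k} \\
a_{21} & a_{22} - \lambda & \cdots & a_{2k} \\
\vdots & \vdots & & \vdots \\
a_{k1} & a_{k2} & \cdots & a_{kk} - \lambda
\end{vmatrix}
\times
\begin{vmatrix}
a_{k+1,k+1} - \lambda & \cdots & a_{k+1,n} \\
\vdots & & \vdots \\
a_{n,k+1} & \cdots & a_{nn} - \lambda
\end{vmatrix}
= p_1(\lambda)\, p_2(\lambda)$$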


4. Theorem 4. If to a certain eigenvalue $\lambda$ there correspond $m$ linearly independent eigenvectors, then their linear hull $L$ is an $m$-dimensional invariant subspace and the transformation induced in $L$ is a similarity transformation with coefficient $\lambda$.

Proof. Let $e_1, \ldots, e_m$ be linearly independent eigenvectors corresponding to the number $\lambda$. Take an arbitrary vector $x$ in $L = L(e_1, \ldots, e_m)$, expand it relative to the basis $e_1, \ldots, e_m$ of the subspace $L$ and apply to it the given transformation $A$ to get

$$Ax = A(x^1 e_1 + \ldots + x^m e_m) = x^1 A e_1 + \ldots + x^m A e_m = x^1 \lambda e_1 + \ldots + x^m \lambda e_m = \lambda x \in L$$


whence  follows  Theorem  4. 


5. In applications of linear algebra an important part is played by the question of simplifying the matrix of a linear transformation by choosing a suitable basis.

Theorem  5.  A  transformation  matrix  A  is  diagonal  if  and  only 
if  the  basis  consists  of  eigenvectors. 

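Proof. (1) If the basis $e_1, \ldots, e_n$ consists of eigenvectors of the transformation $A$, then

$$A e_1 = \lambda_1 e_1, \quad A e_2 = \lambda_2 e_2, \quad \ldots, \quad A e_n = \lambda_n e_n \tag{2}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are eigenvalues (in general, distinct). The coefficients of the right members of (2) form the matrix $A^*$, which in this case coincides with $A$:

$$A^* = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} = A \tag{3}$$

Relative to this basis, the transformation $y = Ax$ has the coordinate (component) representation

$$y^1 = \lambda_1 x^1, \quad y^2 = \lambda_2 x^2, \quad \ldots, \quad y^n = \lambda_n x^n \tag{4}$$

(2) Given, relative to a basis $e_1, \ldots, e_n$, a matrix $A$ in diagonal form:

$$A = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$

Then $A^* = A$,

$$A e_1 = \lambda_1 e_1, \quad A e_2 = \lambda_2 e_2, \quad \ldots, \quad A e_n = \lambda_n e_n$$

and this means that all the basis vectors are eigenvectors, the vector $e_i$ having as its eigenvalue the diagonal entry $\lambda_i$. The theorem is proved.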

Remark. The transformation (4) may be represented as a product of $n$ compressions: first a compression with coefficient $\lambda_1$ to the subspace $L(e_2, \ldots, e_n)$ in the direction of $L(e_1)$, then a compression with coefficient $\lambda_2$ to the subspace $L(e_1, e_3, \ldots, e_n)$ in the direction of $L(e_2)$, and so on. It is easy to verify that the compressions can be carried out in any order. (When $|\lambda_i| > 1$ we have a stretching.)


6. The examples in Subsection 3 of the preceding section show that there may not be a basis of eigenvectors, in which case the transformation matrix cannot be reduced to diagonal form. Just how the transformation matrix may be simplified in that case is considered below in Sections 9, 10.


§  8.  Nilpotent  transformations. 

The  general  structure  of  singular  transformations 

1.  In  this  section  we  consider  singular  linear  transformations  in 
n-dimensional  space  L  (it  is  immaterial  whether  the  space  be  real 
or  complex). 




2. Definition. A transformation $B$ is said to be nilpotent if $B^p = 0$ for some positive integer power $p$.

In other words,

$$B^p x = \theta \quad \text{for any } x \in L \tag{1}$$

The  smallest  (natural)  number  p  for  which  (1)  holds  true  is 
called  the  height  of  the  nilpotent  transformation. 

Remark. If $B^p x = \theta$ for a number $p$ and a vector $x \in L$, then for $x$ and any integer $m > p$ we have

$$B^m x = B^{m-p}(B^p x) = B^{m-p}\theta = \theta$$

The  simplest  example  of  a  nilpotent  transformation  is  the  zero 
transformation  0;  its  height  is  equal  to  unity. 

Every nilpotent transformation is singular. This is clear, for if $B^p = 0$, then $\det(B^p) = (\det B)^p = 0$, hence $\det B = 0$.
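The definition of height can be made concrete with a short computation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, matrices are lists of integer rows, and the loop stops at the $n$-th power, relying on the bound $p \le n$ that is established in Section 9.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def height(b):
    """Smallest p >= 1 with b**p == 0, or None if b is not nilpotent."""
    n = len(b)
    power = [row[:] for row in b]          # power holds b**p
    for p in range(1, n + 1):
        if all(v == 0 for row in power for v in row):
            return p
        power = matmul(power, b)
    return None                            # no power up to n vanished


ZERO = [[0, 0], [0, 0]]
SHIFT = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # ones just above the main diagonal
PROJECTION = [[1, 0], [0, 0]]              # singular but not nilpotent
```

With these sample matrices, `height(ZERO)` is 1, `height(SHIFT)` is 3, and `height(PROJECTION)` is `None`: the last matrix is singular, yet none of its powers vanishes, which is why the theorem below has to split off a nonsingular part.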

However,  nilpotent  transformations  are  not  merely  a  special  case 
of  singular  transformations.  They  are  the  basic  element  in  the 
structure  of  every  singular  transformation.  Namely,  the  following 
theorem  holds. 

Theorem. Let $B$ be a singular linear transformation of the space $L$. Then

$$L = L_1 \oplus L_2 \tag{2}$$

where $L_1$, $L_2$ are invariant subspaces and

(1) the transformation induced in $L_1$ is nilpotent;

(2) if the subspace $L_2$ is not a zero subspace, then the transformation induced in it is nonsingular.

In short, we can say that $B$ is nilpotent in $L_1$ and nonsingular in $L_2$. In particular, the transformation $B$ is nilpotent in $L$ if the dimension of $L_1$ is equal to $n$ ($L_1 = L$) and the dimension of $L_2$ is zero ($L_2 = \theta$), and in this case alone.

The  proof  is  given  below  in  Subsection  4  because  it  is  based  on 
the  auxiliary  propositions  given  in  Subsection  3. 


3. Consider the successive powers of a given singular transformation $B$:

$$B, \; B^2, \; B^3, \; \ldots, \; B^k, \; \ldots$$

Denote by $\mathcal{N}_k$ the null space of the transformation $B^k$ and by $\mathcal{M}_k$ the image of the entire space $L$ under the transformation $B^k$. Let $r_k$ be the dimension of $\mathcal{M}_k$. By Subsection 1 of Section 3,

$$r_k = \operatorname{rank}(B^k)$$

Let us investigate the properties of sequences of the subspaces $\mathcal{N}_k$ and $\mathcal{M}_k$ ($k = 1, 2, \ldots$).




(1) For any $k$, $\mathcal{M}_k$ is an invariant subspace under the transformation $B$.

Proof. If $x \in \mathcal{M}_k$, then there exists a $y \in L$ such that $B^k y = x$, whence we have

$$Bx = B(B^k y) = B^{k+1} y = B^k(By) \in \mathcal{M}_k$$

(2) We have the following inclusions:

$$L \supset \mathcal{M}_1 \supset \mathcal{M}_2 \supset \ldots \supset \mathcal{M}_k \supset \mathcal{M}_{k+1} \supset \ldots \tag{3}$$

Indeed, the inclusions (3) follow from the preceding property since $\mathcal{M}_{k+1} = B(\mathcal{M}_k) \subset \mathcal{M}_k$.

(3) The relations

$$n > r_1 > \ldots > r_{p-1} > r_p = r_{p+1} = \ldots = r_k = \ldots \tag{4}$$

where $p$ is natural, hold true. At the same time,

$$\mathcal{M}_k = \mathcal{M}_p \quad \text{when } k > p$$

A nonsingular transformation is induced in the invariant subspace $\mathcal{M}_p$.

Proof. It is seen from (3) that $r_j \ge r_{j+1}$. Because of the singularity of $B$ we have $r_1 < n$. The ranks $r_j$ are non-negative and so there can only be a finite number of strict inequalities in (4). Let $p$ be the smallest natural number for which the equation

$$r_p = r_{p+1} \tag{5}$$

holds true. If $r_p = 0$, then clearly $r_k = r_p = 0$ and $\mathcal{M}_k = \mathcal{M}_p = \theta$ for $k > p$. Let $r_p \ge 1$. Then from (5) and (3) it follows that $B$ induces in the subspace $\mathcal{M}_p$ a nonsingular transformation, that is, $\mathcal{M}_{p+1} = B(\mathcal{M}_p) = \mathcal{M}_p$, whence $\mathcal{M}_{p+2} = B(\mathcal{M}_{p+1}) = B(\mathcal{M}_p) = \mathcal{M}_p$. Similarly, $\mathcal{M}_{p+3} = \mathcal{M}_p, \ldots, \mathcal{M}_k = \mathcal{M}_p$ for any $k > p$. At the same time, $r_k = r_p$ for $k > p$. The proof of the third property is complete.
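The chain of ranks in (4) is easy to observe on a small example. The sketch below is an illustration only: the helper names are hypothetical, exact arithmetic is done with `fractions.Fraction`, and the sample matrix `B` (a $2 \times 2$ nilpotent block together with the scalar 2) is an assumption chosen so that the ranks first drop and then stabilize.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def rank_sequence(b):
    """Ranks r_1, ..., r_{n+1} of the powers b, b**2, ..., b**(n+1)."""
    ranks, power = [], b
    for _ in range(len(b) + 1):
        ranks.append(rank(power))
        power = matmul(power, b)
    return ranks


def stabilization_index(ranks):
    """Smallest p (1-based) with r_p == r_{p+1}, as in equation (5)."""
    for k in range(len(ranks) - 1):
        if ranks[k] == ranks[k + 1]:
            return k + 1
    return None


B = [[0, 1, 0], [0, 0, 0], [0, 0, 2]]
RANKS = rank_sequence(B)
P = stabilization_index(RANKS)
```

For this matrix the ranks are 2, 1, 1, 1, so $n = 3 > r_1 = 2 > r_2 = 1 = r_3$ and $p = 2$; the one-dimensional image $\mathcal{M}_2$ is spanned by the third basis vector, on which $B$ acts as multiplication by 2, a nonsingular transformation.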

(4) We have the following inclusions:

$$\mathcal{N}_1 \subset \mathcal{N}_2 \subset \ldots \subset \mathcal{N}_k \subset \mathcal{N}_{k+1} \subset \ldots \tag{6}$$

$$\mathcal{N}_k = \mathcal{N}_p \quad \text{if } k > p \tag{7}$$

where $p$ is the smallest number that satisfies the condition (5).

Proof. The inclusions (6) are obvious, for if $x \in \mathcal{N}_k$, then $B^k x = \theta$ and $B^{k+1} x = B(B^k x) = \theta$, so that $x \in \mathcal{N}_{k+1}$.

By Theorem 2, Section 3, the dimension $n_k$ of subspace $\mathcal{N}_k$ is equal to $n - r_k$. As long as the ranks $r_k$ decrease strictly with increasing $k$, the dimensions $n_k$ increase strictly. For $k \ge p$, all ranks $r_k$ are the same; so also are the dimensions $n_p, n_{p+1}, \ldots, n_k, \ldots$ From this and from (6) follows (7).

(5) For any $k$, $\mathcal{N}_k$ is an invariant subspace of the transformation $B$. The transformation induced in $\mathcal{N}_k$ is nilpotent.

Proof. The invariance of $\mathcal{N}_k$ for $k = 1$ follows from the relations $B(\mathcal{N}_1) = \theta \subset \mathcal{N}_1$, and for $k > 1$ from the inclusions (6), since $B(\mathcal{N}_k) \subset \mathcal{N}_{k-1} \subset \mathcal{N}_k$. The nilpotency of the transformation induced in $\mathcal{N}_k$ is obvious since $B^k(\mathcal{N}_k) = \theta$.

4. Proof of the theorem. We will show that

$$L = \mathcal{N}_p \oplus \mathcal{M}_p \tag{8}$$

if $p$ satisfies condition (5). Since $B$ is nilpotent in $\mathcal{N}_p$ and nonsingular in $\mathcal{M}_p$, we thus obtain the desired expansion (2) ($L_1 = \mathcal{N}_p$, $L_2 = \mathcal{M}_p$). We know that the sum of the dimensions of $\mathcal{N}_p$ and $\mathcal{M}_p$ is $n$ and so to obtain (8) it suffices to verify that

$$\mathcal{N}_p \cap \mathcal{M}_p = \theta \tag{9}$$

(see Section 14, Chapter 1, in this connection).

Equation (9) is proved by contradiction. Let $x \ne \theta$, $x \in \mathcal{N}_p \cap \mathcal{M}_p$. We consider the vectors

$$x, \; Bx, \; \ldots, \; B^p x = \theta \tag{10}$$

They all belong to $\mathcal{M}_p$ (because $\mathcal{M}_p$ is invariant). Denote by $y$ the last of the vectors of system (10) that is different from $\theta$ ($y = B^k x$, where $k$ is a number such that $0 \le k < p$). Then we have

$$y \ne \theta, \quad By = \theta, \quad y \in \mathcal{M}_p \tag{11}$$

But (11) contradicts the nonsingularity of the transformation $B$ in the subspace $\mathcal{M}_p$. Thus (9) is established and the theorem is proved.

5. Remarks. (1) From Subsection 3 it is seen that the height of the nilpotent transformation induced in the subspace $\mathcal{N}_p$ is equal to $p$ (here and above, $p$ is the smallest number satisfying the condition (5); as $k$ increases, for $k < p$, we have an extension of the subspaces $\mathcal{N}_k$ and a restriction of the subspaces $\mathcal{M}_k$; when $k \ge p$ the subspaces $\mathcal{N}_k$ and $\mathcal{M}_k$ no longer vary).

Thus, the subspace $\mathcal{N}_p$ may be found as the null space of the transformation $B^k$ for any $k \ge p$. Similarly, $\mathcal{M}_p = B^k(L)$ for $k \ge p$.

(2) It is likewise easy to see that for $k < p$ the intersections $\mathcal{N}_k \cap \mathcal{M}_k$ contain nonzero vectors, and so the sums $\mathcal{N}_k + \mathcal{M}_k$ are not direct sums and do not exhaust the space $L$.



§  9.  The  canonical  basis  of  a  nilpotent  transformation 

1.  Let  us  consider  some  other  questions  relating  to  nilpotent 
transformations.  First  of  all  we  shall  need  some  terminology  that 
will  be  important  for  what  follows. 

We say that the vectors $a_1, a_2, \ldots, a_k$ form a sequence of length $k$ relative to a transformation $B$ (which is not necessarily nilpotent) if these vectors are not zero vectors and if

$$B a_1 = a_2, \quad B a_2 = a_3, \quad \ldots, \quad B a_{k-1} = a_k, \quad B a_k = \theta \tag{1}$$

We will say that $a_1$ is the senior or first vector and $a_k$ is the junior or last vector of the sequence (1). If $a \ne \theta$, $Ba = \theta$, we will say that $a$ forms a sequence consisting of a single vector that is both senior and junior at the same time.

We  say  that  a  basis  of  space  L  is  canonical  relative  to  the  trans¬ 
formation  B  if  it  consists  of  a  single  sequence  or  of  several  se¬ 
quences  that  do  not  have  any  vectors  in  common. 

2.  We  note  the  following  properties  of  nilpotent  transformations: 

(1) If in a space $L$ there is a sequence relative to a nilpotent transformation $B$ containing $k$ vectors, then the height of the transformation is $p \ge k$.

Indeed, in a sequence of type (1), $B^{k-1} a_1 = a_k \ne \theta$, whence $p > k - 1$.

(2) If the height of the nilpotent transformation $B$ is $p$, then there exists a sequence relative to $B$ of length $p$ and there are no longer sequences.

Proof. By the definition of a height, in the space there is a vector $x$ such that $B^{p-1} x \ne \theta$. Then the vectors

$$x, \; Bx, \; B^2 x, \; \ldots, \; B^{p-1} x$$

form a sequence of length $p$ since there are no zero vectors among them (otherwise $B^{p-1} x$ would be a zero vector) and $B(B^{p-1} x) = B^p x = \theta$. By the preceding property, there can be no longer sequence in the space.

(3) Every sequence is linearly independent.

Proof. Write down the following relation for an arbitrary sequence (1):

$$\lambda_1 a_1 + \lambda_2 a_2 + \ldots + \lambda_k a_k = \theta \tag{*}$$

Operate on both members of this equation with the operator $B^{k-1}$ to obtain $\lambda_1 a_k = \theta$, since $B^{k-1} a_1 = a_k$ and $B^{k-1} a_i = \theta$ for $i > 1$. Since $a_k \ne \theta$, we find $\lambda_1 = 0$. Now operating on (*) with the operator $B^{k-2}$, find $\lambda_2 = 0$. Continuing the process, we find that all numbers $\lambda_i = 0$. The contention is proved.




Corollary. If $n$ is the dimension of the space $L$ and $p$ is the height of the nilpotent transformation in $L$, then

$$p \le n$$

(4) If in $L$ there is a canonical basis for a transformation $B$, then $B$ is nilpotent and its height is equal to the number of vectors in the longest sequence of that basis.

Proof. Let $e_1, \ldots, e_n$ be a canonical basis and let $k$ vectors enter into the longest of its sequences. Then for every basis vector we have $B^k e_i = \theta$. Let us take an arbitrary vector $x \in L$, expand it in terms of the basis $e_1, \ldots, e_n$, and apply to it the transformation $B^k$:

$$B^k x = B^k(x^1 e_1 + \ldots + x^n e_n) = x^1 B^k e_1 + \ldots + x^n B^k e_n = \theta$$

This means that $B$ is nilpotent and its height is $p \le k$. On the other hand, by property (1) we have $p \ge k$. Hence $p = k$.

3. Examples. (1) For the zero transformation $0$, every nonzero vector forms a sequence of length $k = 1$, and so every basis of the space $L$ is canonical with respect to $0$. It is readily seen that if $B$ has height $p = 1$, then it is a zero transformation ($B = 0$).

(2) Consider the transformation $G_n(0)$ (see Subsection 8, Section 5, for $\lambda = 0$). By the definition of $G_n(0)$, there is a basis $e_1, \ldots, e_n$ consisting of a single sequence:

$$G_n(0)\, e_n = e_{n-1}, \quad \ldots, \quad G_n(0)\, e_2 = e_1, \quad G_n(0)\, e_1 = \theta$$

From this and from property (4) of Subsection 2 it follows that $G_n(0)$ is nilpotent and its height is $p = n$.

Observe that the matrix of this transformation relative to the given basis $e_1, \ldots, e_n$ is the singular $n$-dimensional Jordan submatrix

$$\begin{pmatrix}
0 & 1 & & & 0 \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
0 & & & & 0
\end{pmatrix}$$

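(3) Let a transformation $B$ be given, relative to a basis $e_1, \ldots, e_n$, by a matrix in which the main diagonal accommodates several singular Jordan submatrices of (possibly) different dimensions, all other elements being zero. We write this matrix symbolically as

$$\begin{pmatrix}
G_{k_1}(0) & & & 0 \\
 & G_{k_2}(0) & & \\
 & & \ddots & \\
0 & & & G_{k_l}(0)
\end{pmatrix} \tag{2}$$

Without loss of generality, we may assume that $k_1 \ge k_2 \ge \ldots \ge k_l$, since by renumbering the basis vectors we can achieve a permutation of the submatrices on the diagonal of the matrix $B$.

For the sake of clarity, we write out in full a matrix of type (2) with three submatrices $G_{k_i}(0)$ of dimensions $k_1 = 4$, $k_2 = 3$ and $k_3 = 1$:

$$\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \tag{2a}$$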

The one-dimensional submatrix $G_1(0)$ consists of the single scalar zero. Within each of the submatrices $G_k(0)$ of dimension $k \ge 2$ there is a diagonal above the main diagonal consisting entirely of units. The sequence of units between every pair of adjacent submatrices is broken by one zero.

Thus, one diagonal of the matrix (2a), like any matrix of type (2), consists of a sequence of units broken by zeros. This diagonal lies immediately above the main diagonal of the matrix. All other entries of the matrix are zeros.

Because of (2), the basis $e_1, \ldots, e_n$ is canonical and it consists of $l$ disjoint sequences with the length of the longest being $k_1$. From this it follows that $B$ is a nilpotent transformation with height $p = k_1$. The linear hull of every sequence entering into a basis is an invariant subspace.
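The structure just described can be generated mechanically from the list of block sizes. The following sketch is an illustration under stated assumptions: the names `nilpotent_jordan` and `height` are hypothetical, and the block sizes 4, 3, 1 are those of the matrix written out in this example.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotent_jordan(sizes):
    """Block-diagonal matrix of singular Jordan submatrices G_k(0), k in sizes."""
    n = sum(sizes)
    m = [[0] * n for _ in range(n)]
    start = 0
    for k in sizes:
        for i in range(start, start + k - 1):
            m[i][i + 1] = 1                # units just above the main diagonal
        start += k                         # the gap between blocks stays zero
    return m


def height(b):
    """Smallest p >= 1 with b**p == 0, or None if no power up to n vanishes."""
    power = [row[:] for row in b]
    for p in range(1, len(b) + 1):
        if all(v == 0 for row in power for v in row):
            return p
        power = matmul(power, b)
    return None


B = nilpotent_jordan([4, 3, 1])
SUPERDIAGONAL = [B[i][i + 1] for i in range(len(B) - 1)]
```

The list `SUPERDIAGONAL` shows the run of units broken by one zero at each block boundary, and `height(B)` equals the size of the largest block, in agreement with $p = k_1$.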

This example includes the two preceding ones as special cases. If $k_1 = 1$, then all submatrices are one-dimensional, each consisting of the single scalar zero, and the entire matrix is a zero matrix and $B$ is a zero transformation. If $l = 1$, then the matrix $B$ consists of one submatrix: $B = G_n(0)$.

It is quite evident that, conversely, if for a transformation $B$ there is a canonical basis and disjoint sequences of the basis are arranged one after the other, then the matrix of the transformation $B$, relative to such a basis, is of type (2). To each sequence corresponds a submatrix $G_{k_\alpha}(0)$ whose dimension $k_\alpha$ is equal to the number of vectors in the sequence.


4. The last of the foregoing examples embraces all possible nilpotent transformations. This follows from the basic theorem (Theorem 1) which we will soon state and prove.


5. Lemma. Given a system of vectors which is a union of several sequences. If the last vectors of all these sequences constitute a linearly independent system, then the given system of vectors is also linearly independent.

The proof is most conveniently carried out after the basic theorem.

Theorem 1. For every nilpotent transformation there exists a canonical basis (which is far from unique).

Proof. We carry out the proof constructively, that is, we will actually show how to construct a canonical basis. Let $B$ be a nilpotent transformation of height $p$ in an $n$-dimensional space $L$.

We consider the familiar sequence of inclusions

$$\mathcal{N}_1 \subset \mathcal{N}_2 \subset \ldots \subset \mathcal{N}_k \subset \mathcal{N}_{k+1} \subset \ldots \subset \mathcal{N}_p = L$$

where $\mathcal{N}_k$ is the null space of the transformation $B^k$. Construct the following subspaces:

$$\mathcal{K}_2 = B(\mathcal{N}_2), \quad \mathcal{K}_3 = B^2(\mathcal{N}_3), \quad \ldots, \quad \mathcal{K}_p = B^{p-1}(\mathcal{N}_p)$$

We have $B(\mathcal{K}_j) = B^j(\mathcal{N}_j) = \theta$. Hence, all $\mathcal{K}_2, \mathcal{K}_3, \ldots, \mathcal{K}_p$ belong to $\mathcal{N}_1$ ($\mathcal{K}_j \subset \mathcal{N}_1$). On the other hand,

$$\mathcal{K}_{i+1} = B^i(\mathcal{N}_{i+1}) = B^{i-1} B(\mathcal{N}_{i+1}) \subset B^{i-1}(\mathcal{N}_i) = \mathcal{K}_i$$

(since $B(\mathcal{N}_{i+1}) \subset \mathcal{N}_i$). Thus

$$\mathcal{K}_p \subset \mathcal{K}_{p-1} \subset \ldots \subset \mathcal{K}_2 \subset \mathcal{N}_1$$

Let $k_j$ be the dimension of $\mathcal{K}_j$ and $n_1$ the dimension of $\mathcal{N}_1$. Choose in $\mathcal{N}_1$ a basis whose vectors are denoted by

$$e^p_1, \ldots, e^p_{k_p}; \quad e^{p-1}_{k_p+1}, \ldots, e^{p-1}_{k_{p-1}}; \quad \ldots; \quad e^2_{k_3+1}, \ldots, e^2_{k_2}; \quad e^1_{k_2+1}, \ldots, e^1_{n_1} \tag{3}$$

The semicolons separate any two groups of vectors, each of which is of a specific nature. The groups are indicated by superscripts.

The choice of basis (3) is done under the following conditions. For the first group, $e^p_i$, take any basis in $\mathcal{K}_p$; for the second group, $e^{p-1}_i$, take any complement of the vectors $e^p_i$ to a basis of $\mathcal{K}_{p-1}$, and so on.

Since $e^p_i \in \mathcal{K}_p = B^{p-1}(\mathcal{N}_p)$, all these vectors have inverse images under $B^{p-1}$. For the vector $e^p_i$ take some inverse image $e^1_i$. We have $e^p_i = B^{p-1}(e^1_i)$. At the same time for every $i$ ($i = 1, \ldots, k_p$) we obtain a sequence of length $p$:

$$e^1_i, \quad e^2_i = B(e^1_i), \quad \ldots, \quad e^p_i = B^{p-1}(e^1_i); \qquad B^p(e^1_i) = \theta$$

Similarly, for $k_p < i \le k_{p-1}$ the vectors $e^{p-1}_i$ have inverse images $e^1_i$ under $B^{p-2}$ because for these values of $i$ we have $e^{p-1}_i \in \mathcal{K}_{p-1} = B^{p-2}(\mathcal{N}_{p-1})$. Accordingly, for every $i$ ($k_p < i \le k_{p-1}$) we obtain a sequence of length $p - 1$: $e^1_i$, $e^2_i = B(e^1_i)$, $\ldots$, $e^{p-1}_i = B^{p-2}(e^1_i)$; $B^{p-1}(e^1_i) = \theta$. Continuing this process, we obtain a system of vectors that can be written compactly in the following array:

$$\begin{array}{llllll}
e^p_{k_p}; & e^{p-1}_{k_{p-1}}; & \ldots; & e^3_{k_3}; & e^2_{k_2}; & e^1_{n_1} \\
e^{p-1}_{k_p}; & e^{p-2}_{k_{p-1}}; & \ldots; & e^2_{k_3}; & e^1_{k_2} & \\
e^{p-2}_{k_p}; & e^{p-3}_{k_{p-1}}; & \ldots; & e^1_{k_3} & & \\
\vdots & \vdots & & & & \\
e^2_{k_p}; & e^1_{k_{p-1}} & & & & \\
e^1_{k_p} & & & & &
\end{array} \tag{4}$$

Here, in each row only the last representative of each group is written. For example, of the group $e^p_1, \ldots, e^p_{k_p}$ only $e^p_{k_p}$ is given in the first row.

Some group may be nonexistent in (4). For instance, if $\mathcal{K}_p$ coincides with $\mathcal{K}_{p-1}$, then in (4) the second group has to be crossed out. Also observe that if, say, $\mathcal{K}_2 = \mathcal{K}_3$ ($k_2 = k_3$), then in the second row of (4) the group $e^1_{k_2}$ drops out and the next row is of the same length. The sequences in (4) are arranged by columns and move upwards. The lower index may be regarded as the number label of the sequence, the upper index as the number label of the vector within the sequence. Thus, the upper row accommodates the last vectors of the sequences. They constitute a basis in $\mathcal{N}_1$ and hence are linearly independent, whence, by the lemma, we conclude that all the vectors of (4) are independent. We now prove that their total number $l$ is equal to $n$, i.e., the dimension of $L$. Clearly

$$l = n_1 + k_2 + k_3 + \ldots + k_p \tag{5}$$

(note again that in (4) each column is actually a symbolical representation of several columns). Let $\rho_k$ be the rank of the transformation $B^k$ on the subspace $\mathcal{N}_{k+1}$ (which is invariant under $B^k$, since $B^k(\mathcal{N}_{k+1}) = \mathcal{K}_{k+1} \subset \mathcal{N}_{k+1}$). Noting that $\mathcal{N}_k$ is the null space of $B^k$ in the subspace $\mathcal{N}_{k+1}$, we have

$$\begin{aligned}
n_1 &= n_2 - \rho_1, & k_2 &= \rho_1; \\
n_2 &= n_3 - \rho_2, & k_3 &= \rho_2; \\
&\;\;\vdots \\
n_{p-1} &= n_p - \rho_{p-1}, & k_p &= \rho_{p-1}; \\
n_p &= n
\end{aligned}$$

whence $n = n_1 + \rho_1 + \ldots + \rho_{p-1} = n_1 + k_2 + \ldots + k_p$. Consequently, $l = n$ and the theorem is proved.

[Fig. 35]

Fig. 35 is an illustration of the array in (4) for the case $n = 4$, $p = 3$, $n_1 = 2$, in which the spaces $\mathcal{K}_3$ and $\mathcal{K}_2$ are one-dimensional and coincide.

Remark. If we wish to prove merely the existence of a canonical basis, it is possible to confine oneself to a briefer argument and take advantage of induction on the height of the transformation. Namely, let there be given in the space $L$ a nilpotent transformation $B$ of height $p + 1 \ge 2$. Then in the subspace $\mathcal{M}_1 = B(L)$ the transformation $B$ has height $p$. In $\mathcal{M}_1$ let a canonical basis be found (if $p = 1$, then any basis in $\mathcal{M}_1$ will be canonical). We supplement the initial vectors of the sequences of this basis with inverse images relative to $B$, thus lengthening each sequence by one vector. Then we complete the collection of last vectors of the sequences to a basis in the subspace $\mathcal{N}_1$. This yields a system of $r_1 + n_1 = n$ vectors ($r_1 = \operatorname{rank} B$ is the dimension of $\mathcal{M}_1$) that is independent by virtue of the lemma and, for this reason, forms a basis in $L$, which is clearly canonical.

6. Let us now return to the proof of the lemma. We can always assume that the given system of vectors is written as (4). Now form an arbitrary linear combination of all vectors of this system and equate it to the zero vector. In expanded notation we get the following equation, all sums in the left member of which are taken only over the lower index:

$$\begin{aligned}
&\sum_{k_p} a^p_i e^p_i + \sum_{k_{p-1}} a^{p-1}_i e^{p-1}_i + \ldots + \sum_{k_2} a^2_i e^2_i + \sum_{n_1} a^1_i e^1_i \\
&\quad + \sum_{k_p} a^{p-1}_i e^{p-1}_i + \sum_{k_{p-1}} a^{p-2}_i e^{p-2}_i + \ldots + \sum_{k_2} a^1_i e^1_i + \ldots \\
&\quad + \sum_{k_p} a^2_i e^2_i + \sum_{k_{p-1}} a^1_i e^1_i \\
&\quad + \sum_{k_p} a^1_i e^1_i = \theta
\end{aligned} \tag{6}$$

Here $a^j_i$ are numbers (the coefficients of the linear combination). The arrangement of the sums corresponds to the array in (4). Indicated under the summation symbol is the number up to which the summation is carried with respect to $i$, with the summation beginning with the number following the one written under the preceding summation symbol of that row (in the first sums of each row the summation begins with unity).

If some group of columns is absent in (4), then in the corresponding column of (6) we take the factors $a^j_i$ to be equal to zero.

Now let us act on both members of (6) with the operator $B^{p-1}$ to get from the last row

$$\sum_{k_p} a^1_i B^{p-1}(e^1_i) = \theta$$

or

$$\sum_{k_p} a^1_i e^p_i = \theta \tag{7}$$

(all other terms of the sum (6), when operated on by the operator $B^{p-1}$, yield $\theta$).

If the last vectors of all sequences form an independent system, then the part $e^p_1, \ldots, e^p_{k_p}$ of this system is also independent. Then from (7) we have

$$a^1_1 = 0, \quad \ldots, \quad a^1_{k_p} = 0$$

Now in (6) cross out the lower row and then operate on the remainder with the operator $B^{p-2}$. As before, we find that the scalars $a^2_i$, $a^1_i$ that participate in the next to the lowest row are all equal to zero. Continuing the process, we find that in general all $a^j_i = 0$. The proof of the lemma is complete.


7. Let $B$ be a nilpotent transformation with height $p$. Denote by $l_j$ the number of sequences of length $j$ relative to some canonical basis of $B$.

Theorem 2. For every $j$ ($j \le p$) the number $l_j$ is invariant relative to a transition to another canonical basis of $B$.

Proof. For the basis that was constructed in the proof of the preceding theorem, we have, by construction,

$$l_p = k_p; \qquad l_j = k_j - k_{j+1}, \quad 2 \le j < p; \qquad l_1 = n_1 - k_2$$

Consider an absolutely arbitrary canonical basis. Observe that the last vectors of all its sequences must lie in $\mathcal{N}_1$. Denote the total number of these vectors by $n'_1$, the number lying in $\mathcal{K}_p$ by $k'_p$, the number lying in $\mathcal{K}_{p-1}$ by $k'_{p-1}$ and so on.

We have

$$n'_1 \le n_1, \qquad k'_j \le k_j \quad (j = 2, \ldots, p) \tag{8}$$

since in each case the number of independent vectors does not exceed the dimension of the space containing them. Let $n'$ be the number of all vectors of the arbitrary basis under consideration. Then, by the proof of the preceding theorem (see (5)),

$$\begin{aligned}
n' &= n'_1 + k'_2 + \ldots + k'_p, \\
n &= n_1 + k_2 + \ldots + k_p
\end{aligned} \tag{8a}$$

Since $n' = n$, from (8) and (8a) we find $n'_1 = n_1$, $k'_j = k_j$. But

$$l'_p = k'_p; \qquad l'_j = k'_j - k'_{j+1}, \quad 2 \le j < p; \qquad l'_1 = n'_1 - k'_2$$

Hence, $l'_j = l_j$ for any $j$. The theorem is proved.

Remark. The gist of the proof of this theorem may be stated in two words thus: the dimensions $k_j$ and $n_1$ of the subspaces $\mathcal{K}_j$ and $\mathcal{N}_1$ are invariant by the definition of these subspaces; but for any canonical basis, all $l_j$ are expressible in terms of $k_j$ and $n_1$. Hence also invariant are all $l_j$.


8. It is easy to express $l_j$ in terms of the ranks of the transformations $B^j$ on the given space $L$.

Denote the rank of $B^j$ on $L$ by $r_j$. By the foregoing we have

$$n_j = n_{j+1} - \rho_j$$

Yet

$$n_j = n - r_j, \qquad n_{j+1} = n - r_{j+1}$$

From these equations we get

$$\rho_j = r_j - r_{j+1}$$

Thus, for $2 \le j < p$,

$$l_j = k_j - k_{j+1} = \rho_{j-1} - \rho_j = r_{j-1} - 2r_j + r_{j+1} \tag{9}$$

Besides,

$$l_1 = n - 2r_1 + r_2, \qquad l_p = r_{p-1} \tag{9a}$$

Remark. To each sequence of a canonical basis corresponds a Jordan submatrix in the matrix (2). Therefore formula (9) expresses the number $l_k$ of Jordan submatrices of dimension $k$ in the matrix (2) for all values of $k$ ($1 \le k \le n$). Also note that assuming $r_0 = n$ (as the rank of $B^0 = E$) and $r_k = 0$ for $k \ge p$ (since for $k \ge p$ we have $B^k = 0$), we can confine ourselves to (9) instead of (9) and (9a):

$$l_k = r_{k-1} - 2r_k + r_{k+1} \tag{9}$$

here we can take any $k \ge 1$.
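Formula (9) translates directly into a small procedure that recovers the sizes of the Jordan submatrices of a nilpotent matrix from the ranks of its powers alone. The sketch below is illustrative and not part of the original text: the function names are hypothetical, exact arithmetic uses `fractions.Fraction`, and the test matrix is assumed to be the one with blocks of dimensions 4, 3 and 1 from Example (3).

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def nilpotent_jordan(sizes):
    """Block-diagonal matrix of singular Jordan submatrices of the given sizes."""
    n = sum(sizes)
    m = [[0] * n for _ in range(n)]
    start = 0
    for k in sizes:
        for i in range(start, start + k - 1):
            m[i][i + 1] = 1
        start += k
    return m


def block_counts(b):
    """Map k -> l_k = r_{k-1} - 2 r_k + r_{k+1}, keeping only nonzero counts."""
    n = len(b)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # B^0 = E
    ranks = [n]                                                  # r_0 = n
    for _ in range(n + 1):
        power = matmul(power, b)
        ranks.append(rank(power))                                # r_1, ..., r_{n+1}
    counts = {}
    for k in range(1, n + 1):
        l_k = ranks[k - 1] - 2 * ranks[k] + ranks[k + 1]
        if l_k:
            counts[k] = l_k
    return counts


B = nilpotent_jordan([4, 3, 1])
COUNTS = block_counts(B)
```

For this matrix the ranks are $r_0 = 8$, $r_1 = 5$, $r_2 = 3$, $r_3 = 1$, $r_4 = 0$, so (9) gives one block each of dimensions 1, 3 and 4 and no block of dimension 2. Because the ranks are invariant under a change of basis, the same counts would be obtained for any matrix similar to `B`.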

9.  Observe  an  obvious  fact  that  will  be  used  in  the  sequel. 

A  transformation  B  is  singular  if  and  only  if  it  has  an  eigen¬ 
value  of  zero. 

10. Theorem 3. For a linear transformation $B$ in $n$-dimensional space $L$ to be nilpotent it is necessary and sufficient that its characteristic polynomial be of the form $p(\lambda) = (-\lambda)^n$.

Proof. Necessity follows from Theorem 1, since in the case of a nilpotent transformation $B$ the characteristic matrix $B - \lambda E$ in the canonical basis has elements $(-\lambda)$ on the diagonal and zeros below, and so

$$p(\lambda) = \det(B - \lambda E) = (-\lambda)^n$$

Sufficiency will not be proved here because it is a consequence of the following more general theorem.

11. Let $B$ be a singular transformation and let

$$p(\lambda) = (-1)^n \lambda^{m_1} (\lambda - \lambda_2)^{m_2} \ldots (\lambda - \lambda_l)^{m_l} \tag{10}$$

be its characteristic polynomial. The roots $\lambda_2, \ldots, \lambda_l$ (in general, complex) are all distinct and nonzero.

By the theorem of Section 8 we have

$$L = L_1 \oplus L_2 \tag{11}$$

with $B$ nilpotent on $L_1$ and nonsingular on $L_2$ ($L_1$ and $L_2$ are invariant under $B$).

Theorem 4. The dimension of $L_1$ is equal to the multiplicity $m_1$ of the zero root of the polynomial $p(\lambda)$. On the subspace $L_2$ the transformation $B$ has characteristic polynomial

$$p_2(\lambda) = (-1)^{n - m_1} (\lambda - \lambda_2)^{m_2} \ldots (\lambda - \lambda_l)^{m_l}$$

Proof. By Theorem 3 of Section 7 we have, according to (11),

$$p(\lambda) = p_1(\lambda)\, p_2(\lambda) \tag{12}$$

Let $n_1$ be the dimension of $L_1$. By Theorem 3 (i.e., by the portion already proved),

$$p_1(\lambda) = (-1)^{n_1} \lambda^{n_1} \tag{13}$$

Comparing (10), (12) and (13), we find: $n_1 \le m_1$. On the other hand, if $n_1 < m_1$, then $p_2(\lambda)$ must have a zero root of multiplicity $m_1 - n_1 > 0$. But this is impossible since the transformation $B$ is nonsingular on $L_2$. Thus, $n_1 = m_1$, from which and also from (10), (12) and (13) follows the second assertion of the theorem.

Remark 1. It is clear that the sufficiency in Theorem 3 is a special
case of Theorem 4 for $m_1 = n$.

Remark 2. Denote by p the height of the transformation B in $L_1$.
We have $p \le m_1$ since the height of the transformation does not
exceed the dimension of the space. On the other hand we know
that $L_1$ may be defined as the null space of the transformation $B^k$
for any $k \ge p$. Therefore, if the multiplicity $m_1$ of the zero
eigenvalue is known, then we can find $L_1$ as the null space of the
transformation $B^{m_1}$ without computing p.
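
As an illustration of Theorem 4 and Remark 2, here is a small worked case. The matrix is chosen for this note and does not come from the text; the basis $e_1, e_2, e_3$ is assumed to be the one in which the matrix is written.

$$B = \begin{Vmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{Vmatrix}, \qquad p(\lambda) = \det(B - \lambda E) = \lambda^2 (2 - \lambda) = (-1)^3 \lambda^2 (\lambda - 2)$$

so that $m_1 = 2$, $\lambda_2 = 2$, $m_2 = 1$. Squaring,

$$B^{m_1} = B^2 = \begin{Vmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 4 \end{Vmatrix}$$

The null space of B alone is spanned by $e_1$ (dimension 1), whereas the null space of $B^2$ is $L_1 = L(e_1, e_2)$, of dimension $2 = m_1$, as Theorem 4 asserts. On $L_2 = B^2(L) = L(e_3)$ the transformation acts as multiplication by 2, so it is nonsingular there and $p_2(\lambda) = (-1)^{3-2}(\lambda - 2) = 2 - \lambda$.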


§  10.  Reducing  a  transformation  matrix 
to  the  Jordan  normal  form 


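1. Definition. We say that matrix A has a Jordan normal form
if Jordan submatrices occupy the main diagonal with zeros elsewhere:

$$A = \begin{Vmatrix} G_{k_1}(\lambda_1) & & & 0 \\ & G_{k_2}(\lambda_2) & & \\ & & \ddots & \\ 0 & & & G_{k_t}(\lambda_t) \end{Vmatrix} \tag{1}$$

The possibility is not precluded that in matrix (1) $k_i = k_j$ or
$\lambda_i = \lambda_j$ for certain i, j.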


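Recall that every Jordan submatrix is a $k_i \times k_i$ matrix of the
form

$$G_{k_i}(\lambda_i) = \begin{Vmatrix} \lambda_i & 1 & & & 0 \\ & \lambda_i & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda_i & 1 \\ 0 & & & & \lambda_i \end{Vmatrix}$$

A Jordan submatrix of order one consists of the single scalar $\lambda_i$:

$$G_1(\lambda_i) = \|\lambda_i\|, \qquad G_1(0) = \|0\|$$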

2.  Theorem.  In  an  n-dimensional  complex  space  L,  every  linear 
transformation  A  has  a  basis  relative  to  which  the  matrix  of  the 
transformation  has  a  Jordan  normal  form.  When  passing  to 
another  analogous  basis,  the  matrix  A  is  preserved  up  to  a  permu¬ 
tation  of  the  submatrices. 

The basis mentioned in the theorem will be called canonical. This
term conforms with the terminology of Section 9: the case considered
in Section 9 is obtained when all $\lambda_i = 0$.

Remark.  The  proof  of  this  theorem  for  nilpotent  transformations 
is  given  in  Section  9.  The  preceding  results  permit  reducing  the 
study  of  the  general  case  to  a  consideration  of  nilpotent  transfor¬ 
mations. 

The  proof  is  given  below  together  with  auxiliary  propositions. 

3. Auxiliary propositions. Given a linear transformation A in an
n-dimensional space L. Set

$$A - aE = B$$

where $a$ is a scalar.

Lemma 1. If B is nilpotent in L and hence has a canonical basis,
then, relative to this basis, A has a Jordan normal form (1), where
all $\lambda_i = a$.

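Proof. In a canonical basis, the transformation B has the matrix

$$\begin{Vmatrix} G_{k_1}(0) & & & 0 \\ & G_{k_2}(0) & & \\ & & \ddots & \\ 0 & & & G_{k_t}(0) \end{Vmatrix} \tag{2}$$

This same basis $e_1, \ldots, e_n$ is canonical for A because of the identity

$$G_{k_i}(a) = G_{k_i}(0) + aE$$

which, written out in full, looks like this:

$$\begin{Vmatrix} a & 1 & & 0 \\ & a & \ddots & \\ & & \ddots & 1 \\ 0 & & & a \end{Vmatrix} = \begin{Vmatrix} 0 & 1 & & 0 \\ & 0 & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{Vmatrix} + a \begin{Vmatrix} 1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{Vmatrix} \tag{3}$$

Because of (2) and (3), the matrix of transformation A relative to
the basis $e_1, \ldots, e_n$ has the Jordan normal form

$$\begin{Vmatrix} G_{k_1}(a) & & 0 \\ & \ddots & \\ 0 & & G_{k_t}(a) \end{Vmatrix}$$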

Lemma 2. Let $\lambda_1, \ldots, \lambda_j$ be the roots of the characteristic
polynomial of the transformation A. Then the transformation
$B = A - aE$ has a characteristic polynomial whose roots are
$\lambda_1 - a, \ldots, \lambda_j - a$, the multiplicity of the root $\lambda_i - a$ being equal
to that of the root $\lambda_i$.

Proof. Lemma 2 follows from the identity

$$\det(B - \lambda E) = \det(A - aE - \lambda E) = p(\lambda + a)$$

Lemma 3. A subspace $\tilde{L}$ is invariant under the transformation
$B = A - \lambda E$ if and only if it is invariant under the transformation A.

Proof. (1) Let $\tilde{L}$ be invariant under A. This means that if $x \in \tilde{L}$,
then $Ax \in \tilde{L}$ and then

$$Bx = Ax - \lambda x \in \tilde{L}$$

(2) If $\tilde{L}$ is invariant under B, then it is also invariant under A
since $A = B - (-\lambda)E$.

4. Proof of the theorem. We now prove the existence of a canonical
basis for an arbitrary linear transformation A specified in an
n-dimensional complex space L.

Let $\lambda_1, \ldots, \lambda_j$ be distinct roots of the characteristic polynomial
$p(\lambda)$ of the transformation A so that

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_j)^{m_j} \tag{4}$$

where $m_i$ is the multiplicity of the root $\lambda_i$ ($i = 1, 2, \ldots, j$),
$m_1 + m_2 + \ldots + m_j = n$. Consider the singular transformation
$B_1 = A - \lambda_1 E$. Denote by $L_1$ the null space of the transformation

$$B_1^{m_1} = B_1 B_1 \cdots B_1 = (A - \lambda_1 E)^{m_1}$$

and put $B_1^{m_1}(L) = \tilde{L}$. We know that

$$L = L_1 \oplus \tilde{L} \tag{5}$$

with $B_1$ nilpotent in $L_1$. By Theorem 1 of Section 9, there is a basis
in $L_1$ canonical for $B_1$. The same basis is canonical also for A
considered in $L_1$ ($L_1$ and $\tilde{L}$ are invariant under A by Lemma 3).
In accord with the expansion (5) we have

$$p(\lambda) = p_1(\lambda)\, \tilde{p}(\lambda)$$

where, by Theorem 4 of Section 9, $p_1(\lambda) = (-1)^{m_1} (\lambda - \lambda_1)^{m_1}$ and
$\tilde{p}(\lambda) = (-1)^{n - m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_j)^{m_j}$.

We now consider A in the invariant subspace $\tilde{L}$ and, arguing as
before, obtain

$$\tilde{L} = L_2 \oplus \tilde{\tilde{L}} \tag{6}$$

where $L_2$ is an invariant subspace of dimension $m_2$ in which there
exists a canonical basis for the transformation $B_2 = A - \lambda_2 E$ and
also for the transformation A. The subspace $L_2$ is defined as the
null space of the transformation

$$B_2^{m_2} = B_2 B_2 \cdots B_2 = (A - \lambda_2 E)^{m_2} \tag{7}$$

considered in $\tilde{L}$. However, if we consider the transformation (7)
throughout the space L, its null space should contain $L_2$ and
have the same dimension $m_2$. Therefore $L_2$ is the null space of (7)
considered throughout the space L. From (5) and (6) we get
$L = L_1 \oplus L_2 \oplus \tilde{\tilde{L}}$.

Continuing this process, we arrive after the jth step at the
expansion

$$L = L_1 \oplus L_2 \oplus \ldots \oplus L_j \tag{8}$$

where $L_i$ is the null space of the transformation $B_i^{m_i} = (A - \lambda_i E)^{m_i}$;
the dimension of $L_i$ is equal to $m_i$. The transformation A has a
canonical basis in each of the $L_i$. The union of these bases yields
the desired basis of the entire space L.



Remark.  The  foregoing  proof  indicates  a  procedure  for  actually 
reducing  A  to  the  Jordan  normal  form  and  indicates  a  method  for 
finding  a  canonical  basis.  But  it  is  also  possible  to  write  the  Jordan 
form  of  the  transformation  A  without  constructing  a  canonical 
basis.  This  possibility  is  a  consequence  of  the  following  subsec¬ 
tion. 

5. We here prove the uniqueness of the Jordan normal
form of the matrix of the given transformation. Let the canonical
basis $e_1, \ldots, e_n$ be found. In it the matrix of the transformation A
has the form (1). Suppose that the Jordan submatrices corresponding
to $\lambda_1$ are located in the first r rows of matrix (1). Then the
linear hull of the first r basis vectors forms the subspace $L_1$ in
expansions of the form (5) and (8):

$$L_1 = L(e_1, \ldots, e_r), \qquad \tilde{L} = L(e_{r+1}, \ldots, e_n) = L_2 \oplus \ldots \oplus L_j$$

The transformation $B_1 = A - \lambda_1 E$ is nilpotent in $L_1$. The basis
$e_1, \ldots, e_r$ of subspace $L_1$ is canonical for $B_1$. The number and
length of the sequences of this basis (relative to $B_1$) are equal to
the number and dimensions of the Jordan submatrices corresponding
to the number $\lambda_1$ in the canonical matrix of the transformation A.
The number of sequences of different length in the canonical basis
of a nilpotent transformation is determined from
formula (9) of Section 9. As applied to this case, the quantities
$r_k$ are equal to the ranks of the transformations $B_1^k$ considered
in $L_1$, or, what is the same thing, to the dimensions of the
subspaces $B_1^k(L_1)$. We will show that in (9), Section 9, we can put in
place of $r_k$ the ranks of the transformations $B_1^k$ considered in the
entire space L. Let $R_k$ be the rank of $B_1^k$ in the space L. Then $R_k$ is
equal to the dimension of $B_1^k(L)$.

Since $B_1$ is nonsingular in subspace $\tilde{L}$, we have

$$\tilde{L} = B_1(\tilde{L}) = B_1^2(\tilde{L}) = \ldots = B_1^k(\tilde{L})$$

By Subsection 6, Section 4, we find

$$B_1^k(L) = B_1^k(L_1 \oplus \tilde{L}) = B_1^k(L_1) \oplus B_1^k(\tilde{L}) = B_1^k(L_1) \oplus \tilde{L} \tag{9}$$

Denote by s the dimension of $\tilde{L}$. From (9) we get

$$R_k = r_k + s$$

so that (since s is independent of k) we have

$$R_{k-1} - 2R_k + R_{k+1} = r_{k-1} - 2r_k + r_{k+1} \tag{10}$$

(the right member of (10) enters into (9), Section 9).


Similar arguments apply to the other singular transformations
$B_i = A - \lambda_i E$.

Conclusion. Let $l_k$ be the number of Jordan submatrices of
dimension $k \ge 1$, corresponding to the eigenvalue $\lambda_i$, in the matrix
of the given transformation A written in the canonical basis. Then

$$l_k = \operatorname{rank}(A - \lambda_i E)^{k-1} - 2 \operatorname{rank}(A - \lambda_i E)^{k} + \operatorname{rank}(A - \lambda_i E)^{k+1} \tag{11}$$

All the terms in the right member of (11) are independent of the
choice of basis. The proof of the theorem of Subsection 2 is complete.
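
Formula (11) can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper names `mat_mul`, `rank` and `jordan_block_counts` are hypothetical, the rank is computed by ordinary Gaussian elimination over exact fractions, and the eigenvalue is assumed to be known in advance and to be rational.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(m):
    """Rank of a matrix, by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def jordan_block_counts(a, eigenvalue):
    """Map each order k to the number of Jordan submatrices of that order
    for the given eigenvalue, using formula (11)."""
    n = len(a)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    b = [[a[i][j] - eigenvalue * identity[i][j] for j in range(n)]
         for i in range(n)]
    # ranks[k] is the rank of (A - eigenvalue * E)^k for k = 0, ..., n + 1
    ranks, power = [], identity
    for _ in range(n + 2):
        ranks.append(rank(power))
        power = mat_mul(power, b)
    counts = {}
    for k in range(1, n + 1):
        count = ranks[k - 1] - 2 * ranks[k] + ranks[k + 1]
        if count:
            counts[k] = count
    return counts


# The 2 x 2 matrix used in the example of Section 12, eigenvalue 5.
print(jordan_block_counts([[4, 1], [-1, 6]], 5))  # {2: 1}
```

Exact fractions are used instead of floating point because a rank computed with rounding errors can silently change the block counts.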

6. Let a canonical basis be found. Then each of the subspaces $L_i$
of (8) is represented in the form of a direct sum of invariant
subspaces, in each of which the transformation is specified by a single
Jordan submatrix. Polynomials of the form $(\lambda - \lambda_i)^k$, which, up
to sign, are equal to the characteristic polynomials of these Jordan
submatrices, are called elementary divisors of the matrix of the
transformation A; k is the order of the Jordan submatrix.

The number of elementary divisors of a given power k with a
given eigenvalue $\lambda_i$ is equal to the number $l_k$ (see (11)). In matrix
theory it is demonstrated that elementary divisors can be computed
if one knows the greatest common divisors of order-s minors of
matrix $A - \lambda E$ for $s = 1, \ldots, n$. We thus have another method for
finding the Jordan normal form of matrix A.

7. The theorem of Subsection 2 is true in real space on the
assumption that all roots of $p(\lambda)$ are real. The proof is that given
in Subsections 3 to 5.

8. If in a real space L a transformation A is given in which
certain roots of the characteristic polynomial are complex, then in
place of (4) we have

$$p(\lambda) = (-1)^{n-m} (\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_k)^{m_k}\, \tilde{p}(\lambda) \tag{12}$$

where $\tilde{p}(\lambda)$ is a polynomial of degree m devoid of real roots,
$m_1 + \ldots + m_k + m = n$. By (12) we get an expansion of the
space L into invariant subspaces: $L = L_1 \oplus \ldots \oplus L_k \oplus \tilde{L}$, where
$L_i$ is a subspace of dimension $m_i$ (the null space of the transformation
$(A - \lambda_i E)^{m_i}$), in which A has one eigenvalue $\lambda_i$, and $\tilde{L}$ is a
subspace of dimension m, in which A is nonsingular and does not
have a single eigenvector. A canonical basis can be chosen in each
one of the $L_i$. In $\tilde{L}$ we choose a basis at pleasure. Then the matrix A
becomes

$$\begin{Vmatrix} G^{(1)} & & & 0 \\ & \ddots & & \\ & & G^{(k)} & \\ 0 & & & \tilde{A} \end{Vmatrix}$$

where $G^{(1)}, \ldots, G^{(k)}$ are Jordan normal forms of the matrices of the
transformation A in the subspaces $L_1, \ldots, L_k$, and $\tilde{A}$ is a nonsingular
m by m matrix. The question of further simplifying the
matrix A (via a special choice of basis in $\tilde{L}$ that simplifies the
submatrix $\tilde{A}$) will not be considered.

§  11.  Transformations  of  a  simple  structure 

1.  Definition.  A  linear  transformation  A  in  space  L  is  called  a 
transformation  of  a  simple  structure  if  there  is  a  basis  in  L  consist¬ 
ing  of  the  eigenvectors  of  that  transformation. 

In  the  case  of  a  transformation  of  a  simple  structure,  the  Jordan 
normal  form  of  the  matrix  consists  of  one-dimensional  Jordan  sub¬ 
matrices.  Actually  we  have  already  dealt  with  transformations  of 
a  simple  structure  in  Subsection  5  of  Section  7. 

We  now  give  two  criteria  for  the  existence  of  a  basis  of  eigen¬ 
vectors. 

2. First criterion (sufficient). If the characteristic polynomial of
a linear transformation A of a complex space L does not have any
multiple roots, then there is a basis in L made up of eigenvectors
of A.

Proof. Under the conditions of the criterion, the expansion (8)
of Section 10 consists of n distinct one-dimensional invariant
subspaces $L_1, \ldots, L_n$. Here, each $L_i$ is a linear hull of the
eigenvector $e_i$. By Section 14, Chapter I, the vectors $e_1, \ldots, e_n$ form a basis
in L.

3. Second criterion (necessary and sufficient). In a complex
space L there exists a basis of eigenvectors of the transformation A
if and only if for each root $\lambda_i$ of the characteristic polynomial $p(\lambda)$
the rank of the matrix $A - \lambda_i E$ is equal to the difference $n - m_i$,
where $m_i$ is the multiplicity of this root and n is the dimension
of L.

Proof. (1) Necessity. Let there be a basis of eigenvectors. Relative
to this basis, the matrix of the transformation A is diagonal
(see Section 7, formula (3)) and the characteristic matrix is of the
form

$$A - \lambda E = \begin{Vmatrix} \lambda_1 - \lambda & & & 0 \\ & \lambda_2 - \lambda & & \\ & & \ddots & \\ 0 & & & \lambda_n - \lambda \end{Vmatrix} \tag{1}$$

so that

$$p(\lambda) = \det(A - \lambda E) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda)$$

If, for example, $\lambda_1$ is of multiplicity $m_1$, that is,

$$\lambda_1 = \lambda_2 = \ldots = \lambda_{m_1}, \qquad \lambda_{m_1 + 1} \ne \lambda_1, \ \ldots, \ \lambda_n \ne \lambda_1$$

then for $\lambda = \lambda_1$ the first $m_1$ elements are equal to zero on the
diagonal of matrix (1) while the remaining elements are nonzero,
and so

$$\operatorname{rank}(A - \lambda_1 E) = n - m_1 \tag{2}$$

Because of the invariance of $p(\lambda)$ and of the rank of the
characteristic matrix, (2) is independent of any choice of basis.

(2) Sufficiency. By Subsection 4, Section 10, the dimension of
the subspace $L_i$ is equal to the multiplicity $m_i$ of the root $\lambda_i$. If

$$\operatorname{rank}(A - \lambda_i E) = n - m_i \tag{3}$$

then to the eigenvalue $\lambda_i$ correspond $m_i$ linearly independent
eigenvectors (see Section 6, Subsection 2). They all lie in the subspace
$L_i$ and form a basis there. If (3) holds for every i, then the union
of such bases for all $i = 1, \ldots, j$ yields a basis of the space L
(see Section 14, Chapter I), which basis consists of eigenvectors.
This completes the proof of the second criterion.

Remark. Under the conditions of the second criterion, the
transformation A acts in each of the $L_i$ like a similarity transformation
with coefficient $\lambda_i$ (in this connection, see Section 7, Subsection 4).
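
To see the second criterion at work, consider two matrices with the same characteristic polynomial. Both matrices are chosen for this note and are not taken from the text.

$$A = \begin{Vmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{Vmatrix}, \qquad A' = \begin{Vmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{Vmatrix}, \qquad p(\lambda) = (2 - \lambda)^2 (3 - \lambda)$$

Here $n = 3$, $\lambda_1 = 2$ with $m_1 = 2$, and $\lambda_2 = 3$ with $m_2 = 1$. For the first matrix

$$\operatorname{rank}(A - 2E) = \operatorname{rank} \begin{Vmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{Vmatrix} = 2 \ne n - m_1 = 1$$

so A has no basis of eigenvectors. For the second matrix

$$\operatorname{rank}(A' - 2E) = 1 = n - m_1, \qquad \operatorname{rank}(A' - 3E) = 2 = n - m_2$$

so the criterion is satisfied for every root, in agreement with the fact that $A'$ is already diagonal.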

4.  Both  criteria  hold  true  in  real  space  under  the  supplementary 
condition  that  all  roots  of  the  characteristic  polynomial  be  real. 

The  proof  is  left  to  the  reader. 

5. From the results of Section 10 it follows that an arbitrary
linear transformation A in a complex linear space L (and also in
real space, provided that $p(\lambda)$ has only real roots) may be given
in the form of the sum

$$A = B + C$$

where B is a nilpotent transformation and C is a transformation
of a simple structure (see (1) and (2) in Section 10).




§  12.  Equivalence  of  matrices 

1. Definition. Two $n \times n$ matrices A and B are said to be
equivalent if there exists a nonsingular $n \times n$ matrix Q such that
$B = QAQ^{-1}$ (the matrices A, B, and Q are either all real or all
complex).

The geometric significance of this definition consists in the
following: if A is regarded as the matrix of some linear transformation
relative to an arbitrarily chosen basis $e_1, \ldots, e_n$, then B
specifies the very same transformation relative to another basis
$e_{1'}, \ldots, e_{n'}$, with $Q = (P^*)^{-1}$, where P is the matrix of the right
members of the formulas

$$e_{i'} = \sum_i P^i_{i'} e_i$$

(see (1a) in Section 2).

It  is  readily  verified  that  if  A  is  equivalent  to  B,  then  B  is  equi¬ 
valent  to  A,  and  that  two  matrices  which  are  separately  equivalent 
to  a  third  are  equivalent.  Thus,  the  entire  collection  of  matrices 
(real  or  complex)  breaks  down  into  disjoint  classes  of  equivalent 
matrices. 

2. Let H be a subgroup of matrices. If in defining equivalence
we take matrices Q in H, we get matrices that are equivalent
relative to the given subgroup H. Distinct matrices equivalent to
a given one relative to H express one and the same linear
transformation with respect to distinct bases belonging to one class of
bases relative to the subgroup H.

3.  From  the  results  of  Section  10,  it  follows  that  for  every  com¬ 
plex  n  X  n  matrix  A  there  exists  an  equivalent  matrix  G  having  a 
Jordan  normal  form.  A  permutation  of  the  submatrices  in  G  car¬ 
ries  G  into  an  equivalent  matrix  G',  since  a  transition  from  G 
to  G'  is  associated,  geometrically,  with  a  permutation  of  certain 
sets  of  vectors  relative  to  a  given  basis.  The  process  of  finding 
a  matrix  G  equivalent  to  A  is  called  reducing  A  to  the  Jordan  nor¬ 
mal  form. 

Two matrices, the Jordan normal forms of which differ in
eigenvalues, number or dimensions of the Jordan submatrices, are not
equivalent.

4. If for a given matrix A we know the Jordan normal form G,
then Q in the equation

$$G = QAQ^{-1} \tag{1}$$

can be found in the following manner. Postmultiply both sides of
(1) by Q and transpose all terms to one side to get

$$GQ - QA = 0 \tag{2}$$

which may be viewed as a homogeneous system of linear equations
in which the unknowns are elements of Q. Any solution of such
a system that satisfies the supplementary condition

$$\det Q \ne 0 \tag{3}$$

yields the desired matrix Q. This method is very cumbersome for
large n since (2) contains $n^2$ equations.

Other  methods  have  been  elaborated  for  finding  the  matrices  G 
and  Q  from  a  given  matrix  A.  One  of  them  was  actually  described 
in  Sections  9  and  10  (also,  see,  for  example  [8]). 


5. Example. Reduce to Jordan normal form the matrix

$$A = \begin{Vmatrix} 4 & 1 \\ -1 & 6 \end{Vmatrix}$$

Solution. Form the characteristic polynomial:

$$p(\lambda) = \begin{vmatrix} 4 - \lambda & 1 \\ -1 & 6 - \lambda \end{vmatrix} = \lambda^2 - 10\lambda + 25$$

whence $\lambda_1 = \lambda_2 = 5$. We then find

$$A - \lambda_1 E = \begin{Vmatrix} -1 & 1 \\ -1 & 1 \end{Vmatrix}, \qquad \operatorname{rank}(A - \lambda_1 E) = 1$$

The sum of the multiplicity of the root $\lambda_1$ and of the rank of
$(A - \lambda_1 E)$ exceeds n = 2 and so there is no basis of eigenvectors
for a transformation with matrix A. In such a situation (n = 2, $\lambda_1$
is a multiple root, there is no basis of eigenvectors), there is only
one possibility for a Jordan normal form: a two-dimensional Jordan
submatrix corresponding to the given root $\lambda_1 = 5$:

$$G = \begin{Vmatrix} 5 & 1 \\ 0 & 5 \end{Vmatrix} \tag{4}$$

We can reason differently. It is easy to compute that

$$(A - \lambda_1 E)^2 = 0$$

and so the ranks of successive nonnegative powers of the matrix
$(A - \lambda_1 E)$ form the sequence

$$r_0 = 2, \quad r_1 = 1, \quad r_2 = r_3 = \ldots = 0$$

Putting the $r_k$ in (11), Section 10, we find that the number of
one-dimensional Jordan submatrices in matrix G is zero, the number
of two-dimensional ones equals unity, in agreement with (4).


Now let us find the matrix Q. Putting

$$Q = \begin{Vmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{Vmatrix}$$

into (2), we get the system of equations

$$\left. \begin{aligned} Q_{11} + Q_{12} + Q_{21} &= 0, \\ -Q_{11} - Q_{12} + Q_{22} &= 0, \\ Q_{21} + Q_{22} &= 0, \\ -Q_{21} - Q_{22} &= 0 \end{aligned} \right\} \tag{5}$$

(All indices are written as subscripts since the tensorial nature of
the formulas is immaterial.) The last two equations of (5) are a
consequence of the first two, from which we find

$$Q_{11} = a, \quad Q_{12} = b, \quad Q_{21} = -a - b, \quad Q_{22} = a + b$$

where a, b are arbitrary scalars. We have to ensure the condition (3):

$$\det Q = \begin{vmatrix} a & b \\ -a - b & a + b \end{vmatrix} = (a + b)^2 \ne 0$$

whence $a \ne -b$. No other restrictions are imposed on a and b.

For instance, taking a = 1 and b = 0, we get

$$Q = \begin{Vmatrix} 1 & 0 \\ -1 & 1 \end{Vmatrix}$$

It is easy to verify that (1) holds true.
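
The verification can also be carried out mechanically. The sketch below is an illustration added to the example, not the book's own procedure: the helper names `mat_mul`, `det2`, `q_matrix` and `satisfies_equation_2` are hypothetical, and the check is done in the form (2), $GQ = QA$, which avoids computing $Q^{-1}$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[4, 1], [-1, 6]]   # the matrix of the example
G = [[5, 1], [0, 5]]    # its Jordan normal form (4)


def q_matrix(a, b):
    """General solution of system (5) for the scalars a and b."""
    return [[a, b], [-a - b, a + b]]


def satisfies_equation_2(q):
    """True when GQ - QA = 0, that is, when equation (2) holds."""
    return mat_mul(G, q) == mat_mul(q, A)


Q = q_matrix(1, 0)
print(Q)                        # [[1, 0], [-1, 1]]
print(det2(Q))                  # 1, so condition (3) holds
print(satisfies_equation_2(Q))  # True
```

Any other pair with $a + b \ne 0$ works equally well; for $a = 1$, $b = -1$ equation (2) is still satisfied but the determinant vanishes, which is why condition (3) has to be imposed separately.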


§  13.  The  Hamilton-Cayley  formula 

1.  A  direct  consequence  of  Section  10  is  an  identity  known  as  the 
Hamilton-Cayley  formula. 

Let

$$p(\lambda) = (-1)^n (\lambda^n + p_1 \lambda^{n-1} + \ldots + p_{n-1} \lambda + p_n)$$

be the characteristic polynomial of the linear transformation A.
Then p(A) is the zero linear transformation. Written out,

$$A^n + p_1 A^{n-1} + \ldots + p_{n-1} A + p_n E = \theta \tag{1}$$

2. Proof. Here the space will be taken to be complex. By
Subsection 4, Section 10, we have

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_j)^{m_j} \tag{2}$$

Accordingly

$$p(A) = (-1)^n (A - \lambda_1 E)^{m_1} \cdots (A - \lambda_j E)^{m_j} \tag{3}$$

which is easy to see if we simultaneously multiply together the
parentheses in the right members of (2) and (3). Using the notation
of Subsection 4, Section 10, we can write, in place of (3),

$$p(A) = (-1)^n B_1^{m_1} \cdots B_j^{m_j} \tag{4}$$

The order of the factors in (3) and (4) is immaterial because here
we have only products of the operators A and E, which are
commutative.

Let x be an arbitrary vector in L. Since L is the sum of the $L_i$,
we can write the expansion

$$x = x_1 + \ldots + x_j, \quad \text{where } x_i \in L_i, \ i = 1, \ldots, j \tag{5}$$

On the other hand, by the definition of $B_i$ and $L_i$ we have

$$B_i^{m_i} x_i = \theta \tag{6}$$

It is evident now that

$$p(A)x = \theta$$

due to (4)-(6), since in (4) the factors can be written in any order.
Formula (1) is thus proved.

The  Hamilton-Cayley  formula  holds  not  only  in  complex  space 
but  in  real  space  as  well  since  a  real  space  can  always  be  extended 
to  form  a  complex  space.  More  explicitly,  if  we  have  a  given  basis, 
we  can  permit  a  consideration  of  vectors  with  complex  compo¬ 
nents  (coordinates);  then  a  linear  transformation  A  naturally  ex¬ 
tends  to  the  resulting  complex  space  (its  matrix  A  must  be  left  un¬ 
changed). 


Chapter  VIII 


SPACES  WITH  QUADRATIC  METRIC 


§ 1. Scalar products

1.  Let  L  be  a  real  linear  space.  In  L  we  introduce  a  new  opera¬ 
tion  called  the  scalar  multiplication  of  vectors. 

Scalar multiplication assigns to each pair of vectors x, y of L
a real number denoted by (x, y) and called the scalar product of
vector x by vector y.

By analogy with elementary analytic geometry we require that
the following properties hold true:

(1) commutativity, $(x, y) = (y, x)$;

(2) distributivity, $(x_1 + x_2, y) = (x_1, y) + (x_2, y)$;

(3) homogeneity, $(ax, y) = a(x, y)$ for every real number (scalar) $a$;

(4) nonsingularity, if $(x, y) = 0$ for a fixed x and any y in L,
then $x = \theta$.

Here x, y, $x_1$, $x_2$ are always arbitrary vectors of space L.

2.  Notice  that  in  elementary  analytic  geometry  these  properties 
of  a  scalar  product  are  proved  as  theorems,  while  we  regard  them 
as  axioms  and  include  them  in  the  definition  of  a  scalar  product. 

3. The second and third properties together signify linearity of
the scalar product in the first argument. Because of commutativity,
we have linearity in the second argument as well.

Thus,  a  scalar  product  (x,  y)  is  a  bilinear  form  which  is  sym¬ 
metric  by  the  first  property  and  nonsingular  by  the  fourth  pro¬ 
perty.  Indeed,  the  fourth  property  means  that  the  zero  subspace 
of  a  bilinear  form  (x,  y)  is  zero-dimensional,  whence  follows  its 
nonsingularity  (see  Section  11  of  Chapter  IV). 

4.  Clearly,  the  converse  is  true  as  well. 

Every  nonsingular  symmetric  bilinear  form  g(x,  y)  specified  in 
a  space  L  may  be  taken  for  a  scalar  product  by  putting 

(x,  y)  =  g(x,  y) 


for any $x, y \in L$.



Remark.  Naturally,  a  scalar  product  depends  on  the  choice  of 
the  form  g(x,  y).  If  we  choose  different  forms  for  the  scalar  pro¬ 
duct,  then  for  a  given  pair  of  vectors  x,  y  in  L  the  scalar  product 
will  in  general  receive  different  numerical  values. 

5. Let a scalar product $(x, y) = g(x, y)$ be introduced in a
space L.

Assuming the space to be n-dimensional, we take an arbitrary
basis $e_1, \ldots, e_n$. If $x = \sum x^i e_i$, $y = \sum y^k e_k$, then the scalar
product will be written in terms of components (coordinates) as
follows:

$$(x, y) = g(x, y) = \sum g_{ik} x^i y^k \tag{1}$$

where $g_{ik}$ are the coefficients of the bilinear form g(x, y) relative
to the given basis $e_1, \ldots, e_n$. They are the values of the form on
the basis vectors. Thus

$$(e_i, e_k) = g_{ik} \tag{2}$$

and $g_{ik} = g_{ki}$. The equations (2) constitute a multiplication table
of the basis vectors.

If the right members of (2) are given, then the scalar product
of any pair of vectors x, y is uniquely determined (according
to (1)).
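
Formulas (1) and (2) translate directly into a short computation. The following is a minimal sketch: the function name `scalar_product` and the particular multiplication table are hypothetical choices for this illustration (the table is symmetric with nonzero determinant, so the corresponding form is nonsingular).

```python
def scalar_product(g, x, y):
    """Formula (1): the sum over i, k of g[i][k] * x[i] * y[k].

    g is the multiplication table (2) of the basis vectors,
    x and y are lists of components relative to the same basis.
    """
    n = len(g)
    return sum(g[i][k] * x[i] * y[k] for i in range(n) for k in range(n))


# A hypothetical multiplication table: (e1, e1) = 2, (e1, e2) = 1, (e2, e2) = -1.
g = [[2, 1],
     [1, -1]]

print(scalar_product(g, [1, 2], [3, 0]))  # 12
print(scalar_product(g, [3, 0], [1, 2]))  # 12, by commutativity
print(scalar_product(g, [1, 0], [0, 1]))  # 1, the entry g[0][1]
```

Changing the table changes the numerical value of the scalar product of the same two vectors, which is the point of the Remark in Subsection 4.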

6. Definition 1. The vectors x, y are called orthogonal if
$(x, y) = 0$.

In terms of components, the orthogonality condition of the
vectors x, y is of the form

$$\sum g_{ik} x^i y^k = 0$$

Definition 2. Vector x is orthogonal to a subspace $L'$ if $(x, y) = 0$
for any $y \in L'$.

Note that if $L'$ has dimension k, then for the orthogonality of x
to the subspace $L'$ it is sufficient that x be orthogonal to any k
independent vectors lying in $L'$. Indeed, if the independent vectors
$a_1, \ldots, a_k$ lie in $L'$ and if $(x, a_1) = 0, \ldots, (x, a_k) = 0$, then for
any $y \in L'$ we have $y = \lambda^1 a_1 + \ldots + \lambda^k a_k$, whence

$$(x, y) = \lambda^1 (x, a_1) + \ldots + \lambda^k (x, a_k) = 0$$

Definition 3. Subspaces $L'$, $L''$ are said to be orthogonal if
$(x, y) = 0$ for any $x \in L'$ and any $y \in L''$.

Definition 4. Subspace $L''$ is called the orthogonal complement
of subspace $L'$ in L if $L'$ and $L''$ are orthogonal and their direct
sum coincides with L.




Remark.  It  is  to  be  stressed  that  the  orthogonality  of  vectors 
and  the  orthogonality  of  subspaces  depends  essentially  on  precisely 
which  bilinear  form  g(x,  y)  is  taken  as  the  scalar  product  (x,  y) 
in  space  L. 

§  2.  The  norm  of  a  vector 

1.  Given  a  scalar  product  in  a  linear  space  L. 

Definition. The norm of a vector x is the number

$$\|x\| = +\sqrt{(x, x)} \tag{1}$$

The norm is a generalization of the concept of the modulus
(absolute value) or length of a vector known from elementary
geometry.

The scalar product (x, x) is a real number but it may not be
positive, so that the norm of a vector may prove to be imaginary.
We make the convention that the radical in (1) can be either a
nonnegative real number or an imaginary number having a positive
multiplier with i ($i = +\sqrt{-1}$).

2. From the definition of a norm it follows that

$$\|ax\| = |a| \cdot \|x\|$$

for any $x \in L$ and any scalar $a$.

In particular

$$\|-x\| = \|x\|, \qquad \|\theta\| = 0 \tag{2}$$

Nonzero vectors whose norm is equal to zero are called isotropic.
Isotropic vectors exist if and only if the quadratic form (x, x) is
not of fixed sign.
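
The convention about the radical and the notion of an isotropic vector can be illustrated as follows. This is a sketch under assumptions: the names `metric_form`, `norm` and `is_isotropic` are hypothetical, and Python's `cmath.sqrt` is used because for a negative real argument it returns the root with positive imaginary part, which matches the convention adopted for formula (1).

```python
import cmath


def metric_form(g, x):
    """The value (x, x), the sum over i, k of g[i][k] * x[i] * x[k]."""
    n = len(g)
    return sum(g[i][k] * x[i] * x[k] for i in range(n) for k in range(n))


def norm(g, x):
    """Norm by formula (1), returned as a complex number.

    The result is a nonnegative real number when (x, x) >= 0 and a
    positive multiple of i when (x, x) < 0.
    """
    return cmath.sqrt(metric_form(g, x))


def is_isotropic(g, x):
    """True for a nonzero vector whose norm is zero."""
    return any(x) and metric_form(g, x) == 0


# A hypothetical metric form that is not of fixed sign: (x, x) = (x1)^2 - (x2)^2.
g = [[1, 0],
     [0, -1]]

print(norm(g, [3, 0]))          # (3+0j), a real norm
print(norm(g, [0, 2]))          # 2j, an imaginary norm
print(is_isotropic(g, [1, 1]))  # True
```

With a positive definite table in place of `g`, `is_isotropic` returns `False` for every vector, in agreement with the last statement of the subsection.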

3. The quadratic form $\|x\|^2 = (x, x)$ is called the metric form
of the space under consideration.

It is determined by the bilinear form (x, y) and in turn defines
it as its polar form. Thus, specification of a scalar product and
specification of a quadratic form are equivalent as far as measuring
the norm of vectors goes. For this reason, spaces with a given
scalar product are also called spaces with a quadratic metric, or
inner-product spaces.

If the space is n-dimensional, then the metric form expressed in
terms of components is

$$\|x\|^2 = (x, x) = \sum g_{ik} x^i x^k$$

4. Theorem. If the metric form is positive definite, then for any
two vectors $x, y \in L$ we have the inequality

$$\|x + y\| \le \|x\| + \|y\| \tag{3}$$

Proof. Take advantage of the Cauchy-Bunyakovsky inequality
(see Section 10, Chapter IV)

$$(x, y)^2 \le (x, x) \cdot (y, y) \tag{4}$$

Taking into account (4), we get

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y) \le (x, x) + 2\sqrt{(x, x) \cdot (y, y)} + (y, y) = (\|x\| + \|y\|)^2$$

whence follows (3).

Remark. From (3) it follows that if the metric form is positive
definite, then

$$\|x - y\| \ge \|x\| - \|y\|, \qquad \|x + y\| \ge \|x\| - \|y\|$$

5. Let us consider an affine space $\mathfrak{A}$ to which corresponds a
linear space L with quadratic metric.

For each pair of points A, B in $\mathfrak{A}$ we define the distance $\rho(A, B)$
and assume it to be equal to the norm of the vector $\overrightarrow{AB}$:

$$\rho(A, B) = \|\overrightarrow{AB}\| \tag{5}$$

We have

$$\rho(A, B) = \rho(B, A), \qquad \rho(A, A) = 0 \tag{6}$$

Formulas (6) follow from (2) and (5).

6. In the case of a positive definite metric form (x, x), the
distance between points is zero if and only if the points are
coincident and, besides, for any three points A, B, C in $\mathfrak{A}$ the triangle
inequality holds:

$$\rho(A, C) \le \rho(A, B) + \rho(B, C) \tag{7}$$

Inequality (7) follows from inequality (3) and formula (5).

7. If the distance between points of the affine space $\mathfrak{A}$ is defined
by formula (5), then we say that a quadratic metric is specified in
the affine space $\mathfrak{A}$. Expressed in terms of affine coordinates, the
square of the distance is

$$\rho^2(A, B) = \sum g_{ik} (x_2^i - x_1^i)(x_2^k - x_1^k) \tag{8}$$

where $x_1^1, \ldots, x_1^n$ are the affine coordinates of the point A and
$x_2^1, \ldots, x_2^n$ are the affine coordinates of the point B.

The right member of (8), which is quadratic with respect to the
differences of the coordinates of the arbitrary points A and B, is
called the metric form of the space $\mathfrak{A}$.
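
Formula (8) and the triangle inequality (7) can be checked on a small case. The sketch below is an added illustration: the names `squared_distance` and `distance` are hypothetical, the coefficients $g_{ik}$ are an arbitrary positive definite choice, and `distance` takes a real square root, so it is meant only for the positive definite case.

```python
import math


def squared_distance(g, a, b):
    """Formula (8): the metric form applied to the coordinate differences."""
    d = [bi - ai for ai, bi in zip(a, b)]
    n = len(g)
    return sum(g[i][k] * d[i] * d[k] for i in range(n) for k in range(n))


def distance(g, a, b):
    """Distance (5); valid as written only for a positive definite form."""
    return math.sqrt(squared_distance(g, a, b))


# A hypothetical positive definite table of coefficients g_ik.
g = [[2, 1],
     [1, 2]]

A, B, C = [0, 0], [1, 0], [1, 1]
print(squared_distance(g, A, B))  # 2
print(squared_distance(g, B, C))  # 2
print(squared_distance(g, A, C))  # 6
print(distance(g, A, C) <= distance(g, A, B) + distance(g, B, C))  # True
```

For a form that is not of fixed sign the squared distance may be negative, and the norm convention of Section 2 would have to be used in place of `math.sqrt`.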




§  3.  Orthonormal  bases 

1.  The  bases  in  a  quadratic-metric  space  are  not  of  the  same 
status.  They  include  some  that  are  most  convenient  from  the  view¬ 
point  of  the  given  metric. 

For instance, the basis $e_1, \dots, e_n$ can be chosen so that the metric form $g(x, x)$ is normal relative to this basis:

$$\|x\|^2 = g(x, x) = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

Then the scalar product of two vectors can be represented thus:

$$xy = x^1 y^1 + \dots + x^k y^k - x^{k+1} y^{k+1} - \dots - x^n y^n$$

It is clear that the scalar products $(e_i, e_j) = 0$ if $i \ne j$, that is, for $i \ne j$ the basis vectors are orthogonal. Here, $\|e_i\|^2 = 1$ if $i = 1, \dots, k$; $\|e_i\|^2 = -1$ if $i = k + 1, \dots, n$. Thus the vectors of the basis are normalized so that the squares of their norms are in absolute value equal to unity. The vectors $e_i$ are called unit vectors if $i \le k$, and imaginary-unit vectors if $i \ge k + 1$. Generally, a vector $a$ is called a unit vector if $\|a\|^2 = 1$ and an imaginary-unit vector if $\|a\|^2 = -1$.

Definition. A basis $e_1, \dots, e_n$ that satisfies the conditions enumerated in this subsection is said to be orthonormal.

Theorem 1. In an n-dimensional linear space with a given quadratic metric, any selection of n pairwise orthogonal unit or imaginary-unit vectors is a basis in which the metric form is normal.

Proof. Let $e_1, \dots, e_n$ be such a selection of vectors. Let us verify that they are linearly independent. We consider the relation

$$\lambda_1 e_1 + \lambda_2 e_2 + \dots + \lambda_n e_n = \theta$$

whence, forming the scalar product with $e_1$, we get

$$\lambda_1 (e_1, e_1) + \lambda_2 (e_2, e_1) + \dots + \lambda_n (e_n, e_1) = (\theta, e_1)$$

But by hypothesis, $(e_1, e_1) = \pm 1$, $(e_j, e_1) = 0$ $(j \ne 1)$; besides, $(\theta, e_1) = 0$. Hence $\lambda_1 = 0$. Similarly we prove that $\lambda_2 = 0, \dots, \lambda_n = 0$. We thus establish that the vectors $e_1, \dots, e_n$ are independent and, hence, do actually constitute a basis.

Since $g(e_i, e_i) = (e_i, e_i) = \pm 1$ and $g(e_i, e_j) = (e_i, e_j) = 0$ for $i \ne j$, the form $g(x, x)$ is normal relative to the basis $e_1, \dots, e_n$.
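The conditions of Theorem 1 are easy to check for concrete vectors. The sketch below assumes a hypothetical two-dimensional space with $k = 1$, with all components taken relative to some orthonormal basis; the helper names `scalar_product` and `gram_matrix` are introduced here only for illustration.

```python
from fractions import Fraction

def scalar_product(x, y, k):
    """Normal-form scalar product: '+' on the first k components, '-' on the rest."""
    return sum((1 if i < k else -1) * xi * yi
               for i, (xi, yi) in enumerate(zip(x, y)))

def gram_matrix(vectors, k):
    """Matrix of all pairwise scalar products of the given vectors."""
    return [[scalar_product(u, v, k) for v in vectors] for u in vectors]

# Hypothetical example: n = 2, k = 1.
e1 = (Fraction(5, 4), Fraction(3, 4))   # unit vector: 25/16 - 9/16 = 1
e2 = (Fraction(3, 4), Fraction(5, 4))   # imaginary-unit vector: 9/16 - 25/16 = -1
G = gram_matrix([e1, e2], k=1)

# Nonzero determinant of the components confirms linear independence.
determinant = e1[0] * e2[1] - e1[1] * e2[0]
print(G, determinant)
```

The matrix of scalar products comes out diagonal with entries $+1$ and $-1$, which is exactly the statement that the metric form is normal relative to the pair $e_1$, $e_2$.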

2.  Along  with  Theorem  1,  we  note  the  following  assertion. 

It is always possible, in n-dimensional linear space, to specify (in unique fashion) a quadratic metric such that an arbitrary preassigned basis $e_1, \dots, e_k, e_{k+1}, \dots, e_n$ will be orthonormal, the vectors $e_1, \dots, e_k$ will be unit vectors, and the vectors $e_{k+1}, \dots, e_n$ will be imaginary-unit vectors. Here, $k$ is also any preassigned integer from 0 to $n$.

Proof. The desired metric is uniquely determined by specifying the metric form $g(x, x)$, which, relative to the basis $e_1, \dots, e_k, e_{k+1}, \dots, e_n$, is

$$g(x, x) = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

3.  By  the  law  of  inertia  of  quadratic  forms,  the  number  of  unit 
and  the  number  of  imaginary-unit  vectors  is  independent  of  the 
choice  of  basis  that  is  orthonormal  in  the  given  quadratic  metric. 

Definition.  The  number  k  of  unit  vectors  of  an  orthonormal  ba¬ 
sis  is  called  the  positive  index  of  the  space  with  given  quadratic 
metric. 

If  k  =  n  or  if  k  =  0,  the  space  is  called  Euclidean. 

If $1 \le k \le n - 1$, the space is called quasi-Euclidean.

Of particular importance is the quasi-Euclidean space when $k = n - 1$. It is called Minkowski space and for $n = 4$ plays an important role in relativity theory.

§  4.  Orthogonal  projection.  Orthogonalization 

1.  In  this  section  we  consider  a  Euclidean  space  L,  that  is,  a 
linear  space  with  a  metric  form  of  fixed  sign.  We  consider  the 
metric  form  to  be  positive  definite.  (The  case  of  a  negative  definite 
metric  form  does  not  require  separate  consideration.  This  will  be 
clear  from  Section  5.)  The  space  L  may  be  infinite-dimensional. 

Let a subspace $L'$ be given in $L$. Suppose that a vector $x \in L$ is given as the sum

$$x = x' + \bar{x} \tag{1}$$

where $x' \in L'$ and $\bar{x}$ is orthogonal to $L'$. Then the vector $x'$ is said to be the orthogonal projection of the vector $x$ on the subspace $L'$. The orthogonal projection of $x$ on $L'$ is unique. Indeed, suppose there is another expansion $x = x'_1 + \bar{x}_1$, where $x'_1 \in L'$, and $\bar{x}_1$ is orthogonal to $L'$. Then $x' - x'_1 = \bar{x}_1 - \bar{x}$, whence

$$(x' - x'_1)^2 = (x' - x'_1,\ x' - x'_1) = (\bar{x}_1 - \bar{x},\ x' - x'_1) = 0 \tag{a}$$

since $x' - x'_1 \in L'$ and $\bar{x}$ and $\bar{x}_1$ are orthogonal to $L'$. From (a) it follows that $x' - x'_1 = \theta$, or $x' = x'_1$, since the metric form of the space is positive definite.

The special case where $L$ is three-dimensional and $L'$ is two-dimensional is shown in Fig. 36.

The transformation of space $L$ which assigns to each vector $x$ a corresponding vector $x'$ by formula (1) is also called an orthogonal projection on $L'$.




If $L$ is viewed as a point space and $L'$ as a plane in that space, then the point $M'$ with radius vector $\overrightarrow{OM'} = x'$ is the orthogonal projection on $L'$ of point $M$ with radius vector $\overrightarrow{OM} = x$ (Fig. 36).

2.  We now demonstrate that the orthogonal projection $M'$ of point $M$ on $L'$ is the point in $L'$ closest to $M$.

Let $y = \overrightarrow{ON}$ be an arbitrary vector of the subspace $L'$. We have to prove that

$$\|x - y\| \ge \|\bar{x}\| \tag{2}$$

where $\bar{x} = x - x'$ is the component of $x$ orthogonal to $L'$, equality in (2) occurring if and only if $y = x'$ (that is to say, when $N$ is coincident with $M'$, Fig. 36).

Fig. 36

Set $x' - y = y_1$. Then $x - y = \bar{x} + y_1$ and

$$\|x - y\|^2 = (\bar{x} + y_1,\ \bar{x} + y_1) = \|\bar{x}\|^2 + \|y_1\|^2 + 2(\bar{x}, y_1) = \|\bar{x}\|^2 + \|y_1\|^2 \tag{3}$$

since $(\bar{x}, y_1) = 0$ because of the orthogonality of the vector $\bar{x}$ to the subspace $L'$ that contains $y_1$.

Observe that

$$\|y_1\|^2 = (y_1, y_1) \ge 0$$

since the metric form of the space at hand is positive definite. Therefore (2) follows from (3). Equality is attained in (2) if and only if $y_1 = \theta$ (that is, when $y = x'$).
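Identity (3) and inequality (2) can be checked on a small example. This sketch assumes a hypothetical setting: $L$ is ordinary three-dimensional space with the usual dot product and $L'$ is the coordinate plane $z = 0$, so the projection is found by dropping the last component.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

# Hypothetical setup: L is R^3 with the usual dot product, L' is the plane z = 0.
x = [1, 2, 3]
x_proj = [1, 2, 0]          # x', the orthogonal projection of x on L'
x_bar = sub(x, x_proj)      # component of x orthogonal to L'

checks = []
for y in ([0, 0, 0], [4, -1, 0], [1, 2, 0]):       # sample vectors of L'
    y1 = sub(x_proj, y)
    lhs = dot(sub(x, y), sub(x, y))                # ||x - y||^2
    identity_3 = lhs == dot(x_bar, x_bar) + dot(y1, y1)
    inequality_2 = lhs >= dot(x_bar, x_bar)
    checks.append((identity_3, inequality_2))
print(checks)
```

For the last sample vector, which coincides with $x'$, the squared distance equals $\|\bar{x}\|^2$ exactly, in agreement with the equality case.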


3.  Let

$$L' = L(z_1, \dots, z_k)$$

where $z_1, \dots, z_k$ is a finite independent set of vectors in $L$. In this case, to find the orthogonal projection $x'$ of the given vector $x$ on the subspace $L'$ it suffices to make a suitable computation of the coefficients $\alpha_1, \dots, \alpha_k$ in the expansion

$$x' = \alpha_1 z_1 + \dots + \alpha_k z_k \tag{4}$$

To do this we write down the orthogonality condition of the vector $\bar{x} = x - x'$ to each of the vectors $z_j$:

$$(x - x', z_j) = 0 \tag{5}$$



Substituting the expansion (4) into (5) and taking advantage of the properties of a scalar product, we obtain for $\alpha_i$ a system of linear equations:

$$\sum_{i=1}^{k} (z_i, z_j)\,\alpha_i = (x, z_j), \qquad j = 1, \dots, k \tag{6}$$

The determinant of system (6) is the Gram determinant for the positive definite quadratic form $(x, x)$ and the independent vectors $z_1, \dots, z_k$. It is therefore positive (see Section 10, Chapter IV) and system (6) is uniquely solvable. The desired projection will thus be found.
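System (6) can be solved mechanically. The following is a minimal sketch, assuming the usual dot product in three-dimensional space as the scalar product and two hypothetical spanning vectors; the small exact solver `solve_linear` is an illustrative helper, not a method taken from the text.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination in exact fractions (matrix assumed nonsingular)."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def project(x, zs):
    """Orthogonal projection of x on the linear hull of the independent vectors zs."""
    gram = [[dot(zi, zj) for zi in zs] for zj in zs]   # row j holds (z_i, z_j)
    rhs = [dot(x, zj) for zj in zs]                    # right members (x, z_j)
    alphas = solve_linear(gram, rhs)                   # coefficients of expansion (4)
    return [sum(a * z[c] for a, z in zip(alphas, zs)) for c in range(len(x))]

# Hypothetical data.
z1, z2 = (1, 1, 0), (0, 1, 1)
x = (1, 2, 3)
x_proj = project(x, [z1, z2])
residual = [xi - pi for xi, pi in zip(x, x_proj)]
print(x_proj, residual)
```

The residual $x - x'$ comes out orthogonal to both $z_1$ and $z_2$, which is condition (5).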

4.  We  will  need  the  following  lemma  later  on. 

Lemma. Given, in a space with a positive definite metric form, a system of pairwise orthogonal vectors $a_1, \dots, a_k$, that is, $(a_i, a_j) = 0$ for $i \ne j$. If none of these vectors is a zero vector, then they are linearly independent.

Proof. Consider the relation

$$\lambda_1 a_1 + \dots + \lambda_k a_k = \theta \tag{7}$$

Form the scalar product of (7) with $a_1$:

$$\lambda_1 (a_1, a_1) + \lambda_2 (a_1, a_2) + \dots + \lambda_k (a_1, a_k) = (a_1, \theta) \tag{8}$$

Since $a_1 \ne \theta$ and the metric form is positive definite, it follows that $(a_1, a_1) = \|a_1\|^2 > 0$. The remaining scalar products in the left member of (8) vanish under the hypothesis of the lemma; $(a_1, \theta) = 0$ because of the participation of the zero vector. Hence $\lambda_1 = 0$. Similarly we demonstrate that $\lambda_2 = \dots = \lambda_k = 0$. The proof of the lemma is complete.

5.  Let there be given in space $L$ an ordered set of linearly independent vectors $e_1, \dots, e_k$. We will now discuss replacing this set by another set of vectors that is orthogonal and, in a certain sense, equivalent to the given set. For this we carry out a geometrical construction called the process of orthogonalization. It resembles the process of choosing a basis when reducing a quadratic form to canonical form by the Jacobi method.

A new system of vectors $e_{1'}, \dots, e_{k'}$ is constructed with the following conditions observed:

(1) $e_{1'} \in L(e_1)$, $e_{2'} \in L(e_1, e_2)$, ..., $e_{j'} \in L(e_1, \dots, e_j)$, ..., $e_{k'} \in L(e_1, \dots, e_k)$;

(2) the vectors $e_{1'}, \dots, e_{k'}$ are pairwise orthogonal;

(3) the set $e_{1'}, \dots, e_{k'}$ is linearly independent.

In that case we say that the new set of vectors is obtained from the original set $e_1, \dots, e_k$ by the process of orthogonalization.



If the given system consists of three vectors $e_1, e_2, e_3$ in three-dimensional Euclidean space, then we construct the new system $e_{1'}, e_{2'}, e_{3'}$ as follows:

we retain the first vector ($e_{1'} = e_1$);

the second vector is drawn orthogonal to it in the plane passing through $e_1$ and $e_2$;

the third vector is drawn orthogonal to this plane (Fig. 37).


When passing to greater dimensionalities, the fourth vector is placed perpendicular to the given three-dimensional space, and so on. In the general case we set

$$\begin{aligned}
e_{1'} &= e_1, \\
e_{2'} &= e_2 + \alpha e_{1'}, \\
e_{3'} &= e_3 + \beta e_{2'} + \gamma e_{1'}, \\
&\;\;\vdots \\
e_{k'} &= e_k + \lambda_1 e_{(k-1)'} + \lambda_2 e_{(k-2)'} + \dots + \lambda_{k-1} e_{1'}
\end{aligned} \tag{9}$$

From formulas (9) it follows that the vectors $e_{i'}$ are located in the required linear hulls and are nonzero because of the independence of the vectors $e_1, \dots, e_k$.

It remains now to select the coefficients $\alpha, \beta, \dots$ so that the vectors $e_{i'}$ are pairwise orthogonal. Then the system $e_{1'}, \dots, e_{k'}$ will be independent by the lemma of Subsection 4.

We find $\alpha$. We have

$$(e_{2'}, e_{1'}) = (e_2, e_{1'}) + \alpha (e_{1'}, e_{1'}) = 0$$

whence

$$\alpha = -\frac{(e_2, e_{1'})}{(e_{1'}, e_{1'})} \tag{10}$$

Division is possible since $(e_{1'}, e_{1'}) = (e_1, e_1) \ne 0$. The vector $(-\alpha e_1)$ is the orthogonal projection of $e_2$ on $L(e_1)$ (Fig. 38).

We now ensure that the third vector is orthogonal to the first two:

$$\begin{aligned}
(e_{3'}, e_{1'}) &= (e_3, e_{1'}) + \underline{\beta (e_{2'}, e_{1'})} + \gamma (e_{1'}, e_{1'}) = 0, \\
(e_{3'}, e_{2'}) &= (e_3, e_{2'}) + \beta (e_{2'}, e_{2'}) + \underline{\gamma (e_{1'}, e_{2'})} = 0
\end{aligned}$$

The underlined terms vanish and $(e_{2'}, e_{2'}) \ne 0$ by construction. Therefore we find

$$\gamma = -\frac{(e_3, e_{1'})}{(e_{1'}, e_{1'})}, \qquad \beta = -\frac{(e_3, e_{2'})}{(e_{2'}, e_{2'})} \tag{11}$$

Geometrically, formulas (9) and (11) mean that to construct vector $e_{3'}$ it is necessary to subtract from vector $e_3$ its orthogonal projection on the subspace $L(e_1, e_2)$ (Fig. 39).


From  there  on  the  process  continues  in  similar  fashion. 
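The whole process can be sketched in a few lines. The code assumes the usual dot product as the scalar product and three hypothetical starting vectors; each coefficient is computed in the pattern of formulas (10) and (11), that is, minus the scalar product of the original vector with an already constructed one, divided by the scalar square of the latter.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(vectors):
    """Build e_1', ..., e_k' from independent e_1, ..., e_k as in formulas (9)."""
    result = []
    for e in vectors:
        new = [Fraction(c) for c in e]
        for prev in result:
            # Coefficient of the same shape as alpha, beta, gamma in (10), (11).
            coeff = -dot(e, prev) / dot(prev, prev)
            new = [n + coeff * p for n, p in zip(new, prev)]
        result.append(new)
    return result

# Hypothetical independent vectors in three-dimensional Euclidean space.
e1, e2, e3 = (1, 1, 0), (1, 0, 1), (0, 1, 1)
ortho = orthogonalize([e1, e2, e3])
print(ortho)
```

The first vector is retained unchanged, and every later vector differs from the original one only by a combination of earlier vectors, so each new vector stays in the required linear hull.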

6.  In the orthogonalization process it is often necessary to ensure two more supplementary conditions:

(4) for any $j$ ($1 \le j \le k$) the system $e_{1'}, \dots, e_{j'}$ is oriented like the system $e_1, \dots, e_j$;

(5) $\|e_{j'}\| = 1$.

Formulas (9) guarantee condition (4). Indeed, from (9) we have

$$\begin{aligned}
e_1 &= e_{1'}, \\
e_2 &= -\alpha e_{1'} + e_{2'}, \\
&\;\;\vdots \\
e_k &= -\lambda_{k-1} e_{1'} - \dots + e_{k'},
\end{aligned}$$

so that in the matrix expressing $e_i$ in terms of $e_{i'}$ the upper left minor of order $j$ (for any $j \le k$) is positive (and equal to $+1$).

To ensure condition (5), it suffices, after carrying through the orthogonalization, to divide each of the resulting vectors by its norm.

Remark. It is easy to prove (by induction, for instance) that conditions (1) to (5) given in Subsections 5 and 6 uniquely determine a system of vectors $e_{1'}, \dots, e_{k'}$ from the given system $e_1, \dots, e_k$.

7.  Legendre polynomials. In mathematical analysis and its applications, one makes use of expansions of arbitrary functions in series of given functions, such expansions being viewed like the expansions of vectors in terms of a given basis. It is then convenient to have analogues of an orthogonal basis. These are orthogonal systems of functions. The Legendre polynomials are an elementary instance of such an orthogonal system.

Introduced in the space of continuous functions on the interval $[-1, 1]$ is a quadratic metric with scalar product

$$(x, y) = \int_{-1}^{+1} x(t)\,y(t)\,dt \tag{12}$$

Accordingly,

$$\|x\|^2 = \int_{-1}^{+1} x^2(t)\,dt \tag{13}$$

We have already considered (13) and have demonstrated that this is a quadratic form (see Section 4, Chapter IV). Notice that it is positive definite: $\|x\|^2 \ge 0$ with $\|x\|^2 = 0$ if and only if the continuous function $x(t) = 0$ at all points of the interval.

Take a sequence of monomials

$$1,\ t,\ t^2,\ t^3,\ \dots \tag{14}$$

and apply to it the process of orthogonalization to get the sequence of polynomials

$$f_0(t) = 1, \quad f_1(t) = t, \quad f_2(t) = t^2 - \tfrac{1}{3}, \quad f_3(t) = t^3 - \tfrac{3}{5}\,t, \quad \dots \tag{15}$$

The number-labels of the polynomials in (15) are chosen so that they coincide with their degrees. The coefficients of the polynomials are computed from formulas (9) with account taken of (10), (11), (12) and (14).

Following a special normalization like

$$p_k(t) = \lambda_k f_k(t)$$

where $\lambda_k$ are chosen from the condition

$$p_k(1) = 1 \tag{16}$$

we get a sequence of polynomials $p_k(t)$ (of degree $k = 0, 1, 2, \dots$) which are called Legendre polynomials. It can be demonstrated that

$$p_k(t) = \frac{1}{2^k\,k!}\,\frac{d^k}{dt^k}\,(t^2 - 1)^k \tag{17}$$

Taking into account the remark of Subsection 6, it suffices to verify that all polynomials (17) are pairwise orthogonal (it is convenient here to make use of integration by parts) and that they satisfy the condition (16).

It can also be proved that

$$\|p_k(t)\|^2 = \frac{2}{2k + 1}$$

Thus, the set of Legendre polynomials is orthogonal but not normalized (the norms of $p_k$ are not equal to unity).
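The first few polynomials can be recomputed directly from the scalar product (12). In this sketch a polynomial is represented, by assumption, as a list of exact coefficients with the lowest power first; the function names are introduced here only for illustration.

```python
from fractions import Fraction

def inner(p, q):
    """(p, q) = integral over [-1, 1] of p(t) q(t) dt, for coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                 # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def orthogonalized_monomials(count):
    """Apply the orthogonalization process to 1, t, t^2, ... (lowest power first)."""
    result = []
    for degree in range(count):
        monomial = [Fraction(0)] * degree + [Fraction(1)]
        new = list(monomial)
        for prev in result:
            coeff = -inner(monomial, prev) / inner(prev, prev)
            for i, c in enumerate(prev):
                new[i] += coeff * c
        result.append(new)
    return result

def legendre(count):
    """Rescale each f_k so that p_k(1) = 1; sum(f) is the value f_k(1)."""
    return [[c / sum(f) for c in f] for f in orthogonalized_monomials(count)]

f = orthogonalized_monomials(4)
p = legendre(4)
print(f[2], f[3])
print([inner(pk, pk) for pk in p])
```

The squared norms printed at the end can be compared with the value $2/(2k + 1)$ stated in the text.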

§  5.  Metric  isomorphism 

1.  Definition. Two quadratic-metric spaces $L$ and $L'$ are said to be metrically isomorphic to one another if there exists a linear isomorphism between them under which the scalar product of any pair of vectors in $L$ is equal to the scalar product of their images in $L'$. Under this condition, the linear isomorphism is called a metric isomorphism (the linear isomorphism is discussed in Section 10, Chapter I).

Remark. Metrically isomorphic spaces have the same properties, not only linear but also metric, that is to say, those based on the concept of a scalar product. It therefore suffices to study one of a set of metrically isomorphic spaces in order to have a knowledge of all of them.

2.  Theorem 1. Quadratic-metric spaces with the same dimensions and the same positive indices are metrically isomorphic.

Proof. Let $L$ and $L'$ both be n-dimensional and let them have one and the same positive index $k$ ($0 \le k \le n$). By Section 4 we can find in $L$ an orthonormal basis $e_1, \dots, e_n$, and in $L'$ an orthonormal basis $e_{1'}, \dots, e_{n'}$. These bases have the same number of unit vectors (equal to $k$). We assume that in each of them the first $k$ vectors are unit vectors.

Let $x$ be an arbitrary vector of $L$. Expand it in terms of the basis $e_1, \dots, e_n$: $x = x^1 e_1 + \dots + x^n e_n$. To the vector $x$ let there be associated a vector $x' \in L'$ which has the same components relative to the basis $e_{1'}, \dots, e_{n'}$: $x' = x^1 e_{1'} + \dots + x^n e_{n'}$. We have thus established a linear isomorphism between $L$ and $L'$ (see Section 10, Chapter I). Now consider two arbitrary vectors $x$, $y$ of $L$ and their images $x'$, $y'$ in $L'$. Since, relative to the bases $e_1, \dots, e_n$ and $e_{1'}, \dots, e_{n'}$, the metric forms of the spaces $L$ and $L'$ have the same component representations and the components of the vectors $x$, $y$ coincide respectively with the components of the vectors $x'$, $y'$, it follows that $(x, y) = (x', y')$. Thus the linear isomorphism established between $L$ and $L'$ is a metric isomorphism. The theorem is proved.

For  the  quadratic-metric  spaces  L  and  L'  the  following  theo¬ 
rem  holds  true. 



Theorem 2. If $L$ has dimension $n$ and positive index $k$ ($0 \le k \le n$) and $L'$ is metrically isomorphic to $L$, then $L'$ also has dimension $n$ and positive index $k$.

Proof. Let $e_1, \dots, e_n$ be an orthonormal basis in $L$ with the first $k$ vectors being unit vectors. Let the vectors $e_{1'}, \dots, e_{n'} \in L'$ correspond to the vectors $e_1, \dots, e_n$ by an isomorphism. Since a metric isomorphism is a linear isomorphism, by repeating the proof of Theorem 2, Section 10, Chapter I, we find that the dimension of $L'$ is equal to $n$ and that $e_{1'}, \dots, e_{n'}$ constitute a basis in $L'$. It follows immediately from the definition of a metric isomorphism that the basis $e_{1'}, \dots, e_{n'} \in L'$ is orthonormal and that the first $k$ vectors in it (and only those vectors) are unit vectors. The proof of the theorem is complete.

Corollary. Quasi-Euclidean spaces with different dimensions or with different positive indices are not isomorphic. A Euclidean space is not isomorphic to any quasi-Euclidean space.

Remark.  However,  there  is  no  need  to  make  a  separate  study 
of  n-dimensional  quadratic-metric  spaces  with  positive  indices  k 
and  n  —  k.  It  is  sufficient  to  change  the  sign  of  the  metric  form  in 
one  of  them  in  order  to  obtain  another. 

§ 6. k-orthogonal matrices and k-orthogonal groups

1.  We consider an n-dimensional quadratic-metric space with a given positive index $k$ ($0 \le k \le n$). In this space take any two bases $e_1, \dots, e_n$ and $e_{1'}, \dots, e_{n'}$, provided that they are both orthonormal and the first $k$ vectors in each are unit vectors.

By the usual procedure write down the change-of-basis formulas from first to second:

$$e_{i'} = \sum_{i} P_{i'}^{i}\, e_i \tag{1}$$

The matrix $P$ made up of the coefficients of these formulas is of a special nature in this case. To figure it out, write down the metric form of the space relative to the basis $e_1, \dots, e_n$:

$$\|x\|^2 = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

Along with this form consider the matrix

$$E_k = \begin{pmatrix} 1 & & & & & 0 \\ & \ddots & & & & \\ & & 1 & & & \\ & & & -1 & & \\ & & & & \ddots & \\ 0 & & & & & -1 \end{pmatrix} \tag{1}$$

in which we have $+1$ in the first $k$ places on the main diagonal and $-1$ in the remaining places of the main diagonal with zeros elsewhere. Clearly, the matrix $G$ of the metric form relative to the basis $e_1, \dots, e_n$ coincides with matrix $E_k$:

$$G = E_k$$

By virtue of our conditions concerning the bases at hand, the metric form in the basis $e_{1'}, \dots, e_{n'}$ has exactly the same form (1) as in the basis $e_1, \dots, e_n$. And so matrix $G'$ of the metric form in the basis $e_{1'}, \dots, e_{n'}$ is also equal to $E_k$:

$$G' = E_k$$

On the other hand, by the general law for transformation of the matrix of a quadratic form we have

$$G' = P G P^*$$

We thus conclude that if $P$ is a change-of-basis matrix from one orthonormal basis to another, then

$$P E_k P^* = E_k \tag{2}$$

It is essential to note that in both bases it is precisely the first $k$ vectors that are unit vectors, all other vectors being imaginary-unit ones.

It is easy to see that the converse has been proved at the same time: if the matrix $P$ satisfies condition (2), if the original basis $e_1, \dots, e_n$ is orthonormal, and if the first $k$ vectors in it are unit vectors (the remaining being imaginary-unit vectors), then the basis $e_{1'}, \dots, e_{n'}$ obtained by (1) will also be orthonormal and its first $k$ vectors will also be unit vectors.

2.  Definition. Any matrix $P$ satisfying condition (2) is termed a k-orthogonal ($0 \le k \le n$) matrix.
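Since the definition is purely algebraic, condition (2) can be tested directly on a given matrix. The sketch below uses exact fractions and two hypothetical $2 \times 2$ matrices; the helper names `e_k` and `is_k_orthogonal` are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def e_k(n, k):
    """Diagonal matrix with +1 in the first k places and -1 in the rest."""
    return [[(1 if i < k else -1) if i == j else 0 for j in range(n)]
            for i in range(n)]

def is_k_orthogonal(p, k):
    """Check condition (2): P E_k P* = E_k, with * denoting the transpose."""
    e = e_k(len(p), k)
    return matmul(matmul(p, e), transpose(p)) == e

# Hypothetical test matrices for n = 2.
hyperbolic = [[Fraction(5, 4), Fraction(3, 4)],
              [Fraction(3, 4), Fraction(5, 4)]]
rotation = [[Fraction(3, 5), Fraction(4, 5)],
            [Fraction(-4, 5), Fraction(3, 5)]]

print(is_k_orthogonal(hyperbolic, 1), is_k_orthogonal(hyperbolic, 2))
print(is_k_orthogonal(rotation, 2), is_k_orthogonal(rotation, 1))
```

The same matrix may satisfy (2) for one value of $k$ and fail it for another, so the index $k$ is an essential part of the definition.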

Note  that  this  definition  is  of  a  purely  algebraic  nature.  It  could 
be  given  quite  apart  from  the  geometry  of  quadratic-metric  spaces. 

3.  All k-orthogonal matrices are nonsingular. Indeed, it is obvious that $\det E_k = \pm 1$.

From this and from (2)

$$\det P \cdot \det P^* = 1 \tag{3}$$

Hence $\det P \ne 0$. And from (3) it is also clear that $\det P = \pm 1$.

4.  Because of (2) we have $P^{-1} E_k = E_k P^*$ or (since $E_k E_k = E$)

$$P^{-1} = E_k P^* E_k \tag{4}$$



We see that the matrix-inversion operation, which in the general case is a cumbersome one, reduces, for k-orthogonal matrices, to the operation of taking the transpose and multiplying by $E_k$ (the latter merely denotes a change in the sign of certain elements).

5.  Theorem. k-orthogonal matrices constitute a subgroup of the group of all nonsingular $n \times n$ matrices.

We denote the subgroup by $O_k$ and will speak of the k-orthogonal subgroup (or group).

Proof. For the time being let $O_k$ merely denote the set of all k-orthogonal $n \times n$ matrices. From $O_k$ take any two matrices $P_1$, $P_2$; then $P_1 E_k P_1^* = E_k$, $P_2 E_k P_2^* = E_k$, whence

$$(P_1 P_2) E_k (P_1 P_2)^* = P_1 (P_2 E_k P_2^*) P_1^* = P_1 E_k P_1^* = E_k$$

Thus if $P_1 \in O_k$ and $P_2 \in O_k$, then $P_1 P_2 \in O_k$.

From $O_k$ take an arbitrary matrix $P$; then $P E_k P^* = E_k$. From this and because of (4)

$$P^{-1} E_k (P^{-1})^* = P^{-1} E_k (E_k P^* E_k)^* = P^{-1} E_k (E_k P E_k) = P^{-1} E_k^2 P E_k = E_k$$

Thus if $P \in O_k$, then $P^{-1} \in O_k$ and the proof is complete.

Remark. By Subsection 3, all $O_k$ ($0 \le k \le n$) lie in the subgroup of $n \times n$ matrices with unit modulus of the determinant.
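Formula (4) and the closure of $O_k$ under multiplication can both be observed on a small example. The sketch assumes $n = 2$, $k = 1$ and two hypothetical 1-orthogonal matrices with rational entries; `inverse_k_orthogonal` is an illustrative name.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def e_k(n, k):
    """Diagonal matrix with +1 in the first k places and -1 in the rest."""
    return [[(1 if i < k else -1) if i == j else 0 for j in range(n)]
            for i in range(n)]

def inverse_k_orthogonal(p, k):
    """Formula (4): P^{-1} = E_k P* E_k (valid only when P is k-orthogonal)."""
    e = e_k(len(p), k)
    return matmul(matmul(e, transpose(p)), e)

# Two hypothetical 1-orthogonal matrices for n = 2.
p1 = [[Fraction(5, 4), Fraction(3, 4)], [Fraction(3, 4), Fraction(5, 4)]]
p2 = [[Fraction(5, 3), Fraction(4, 3)], [Fraction(4, 3), Fraction(5, 3)]]

identity = e_k(2, 2)
e1 = e_k(2, 1)
product = matmul(p1, p2)

print(matmul(p1, inverse_k_orthogonal(p1, 1)) == identity)      # (4) gives the inverse
print(matmul(matmul(product, e1), transpose(product)) == e1)    # product stays in O_1
```

No elimination is needed to invert such a matrix: a transpose and a few sign changes suffice, which is the point of the remark following formula (4).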

6.  The set of all orthonormal bases in quadratic-metric space with a given positive index is nothing but a class of bases defined with respect to the group $O_k$ by some one orthonormal basis of the space (see Section 1, Chapter VI).

The geometry of quadratic-metric space has as its object the study of invariants with respect to the group $O_k$ in the class of orthonormal bases. We have in mind here invariants in the broad sense of the word; namely, not only invariant numerical quantities (like, say, scalar products, the norm of a vector), but also invariant objects (say, planes) and invariant relations (for instance, the orthogonality relation).

7.  Observe that any class of bases relative to the group $O_k$ is a class of orthonormal bases in some (quite definite) quadratic metric.

Indeed, let $e_1, \dots, e_n$ be an arbitrary basis of a linear space $L$. By Subsection 2, Section 3, there exists a (quite definite) quadratic metric in which the basis $e_1, \dots, e_n$ is orthonormal and has the first $k$ vectors for unit vectors. Then the class of bases defined with respect to the group $O_k$ by the basis $e_1, \dots, e_n$ will consist of orthonormal bases in precisely this metric.



8.  Conclusion. Thus the set of all bases of an n-dimensional linear space splits up into classes with respect to the group $O_k$ so that to each class there corresponds a specific quadratic metric in which the bases of this class are orthonormal.

At the same time there is defined an infinite set of quadratic-metric spaces on one and the same, to put it pictorially, linear “skeleton” $L$. They are all metrically isomorphic to one another. The geometries of these spaces are algebraically identical since they all have as their object of study the invariants of the group $O_k$. However, from the viewpoint of the linear space $L$, these quadratic-metric spaces are distinct for the reason that one and the same pair of vectors $x$, $y$ in $L$ have in them distinct numerical values of the scalar product. This will all shortly be explained in examples (see Sections 7 and 8).

9.  Assume that the quadratic metric has been chosen. Let the k-orthogonal matrix $P$ define a transition from the orthonormal basis $e_1, \dots, e_n$ to the orthonormal basis $e_{1'}, \dots, e_{n'}$. Also consider the corresponding transformation of components (coordinates):

$$x^{i'} = \sum_{i} Q_{i}^{i'}\, x^i \tag{5}$$

The matrix of this transformation is $Q = (P^*)^{-1}$. To see that the matrix $Q$ is also k-orthogonal, consider the following chain of equations:

$$Q E_k Q^* = (P^*)^{-1} E_k P^{-1} = (P E_k P^*)^{-1} = E_k^{-1} = E_k$$

We obtain the relation $Q E_k Q^* = E_k$, which establishes the k-orthogonality of $Q$.

Thus, in changing from one orthonormal basis to another one, the components of an arbitrary vector are subjected (as variables) to a linear transformation with a k-orthogonal matrix.

Remark. The linear transformations (5) of the variables $x^1, \dots, x^n$ to the variables $x^{1'}, \dots, x^{n'}$ with the k-orthogonal matrix $Q$ may be characterized without resorting to a transformation of bases and, accordingly, without resorting to matrix $P$. It is precisely such linear transformations, and only such transformations, that preserve the normal form of a quadratic form. In other words, if matrix $Q$ is k-orthogonal (and only in this case), then we have the identity

$$(x^{1'})^2 + \dots + (x^{k'})^2 - (x^{(k+1)'})^2 - \dots - (x^{n'})^2 = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

in the left-hand member of which $x^{1'}, \dots, x^{n'}$ are expressed by formulas (5).
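The identity of the Remark can be illustrated numerically. The sketch assumes $n = 2$, $k = 1$, a hypothetical 1-orthogonal matrix $Q$ acting on the column of components, and, for contrast, a hypothetical matrix that is not 1-orthogonal.

```python
from fractions import Fraction

def normal_form(x, k):
    """(x^1)^2 + ... + (x^k)^2 - (x^{k+1})^2 - ... - (x^n)^2."""
    return sum((1 if i < k else -1) * xi * xi for i, xi in enumerate(x))

def apply(q, x):
    """New components: each one is a row of q times the old components."""
    return [sum(qij * xj for qij, xj in zip(row, x)) for row in q]

# Hypothetical 1-orthogonal matrix Q for n = 2.
q = [[Fraction(5, 4), Fraction(-3, 4)],
     [Fraction(-3, 4), Fraction(5, 4)]]
# A matrix that is not 1-orthogonal, for contrast.
shear = [[1, 1],
         [0, 1]]

x = [2, 1]
print(normal_form(x, 1), normal_form(apply(q, x), 1), normal_form(apply(shear, x), 1))
```

The first two printed values agree, while the third differs, in line with the statement that only k-orthogonal transformations preserve the normal form.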



10.  From the foregoing it is clear that the set of all linear transformations of variables with k-orthogonal matrices constitutes a group that is isomorphic to the group $O_k$; the isomorphism here can be a mapping which to the matrix $P \in O_k$ associates a linear transformation with matrix $Q = (P^*)^{-1}$. This is evident from formal operations with matrices. If $P_1, P_2 \in O_k$ and $Q_1 = (P_1^*)^{-1}$, $Q_2 = (P_2^*)^{-1}$, then $Q_1 Q_2 = (P_1^*)^{-1} (P_2^*)^{-1} = (P_2^* P_1^*)^{-1} = ((P_1 P_2)^*)^{-1}$. We see that to a product of matrices $P_1$, $P_2$ corresponds a product of their images, which serves as the condition of an isomorphism.

§  7.  The  group  of  Euclidean  rotations 

1.  In the two-dimensional case there are two metrically nonisomorphic spaces with positive index $k = 1$ and $k = 2$ respectively.

If $k = 2$, then the metric form relative to an orthonormal basis is

$$\|x\|^2 = (x^1)^2 + (x^2)^2 \tag{1}$$

To it there corresponds the geometry of the ordinary Euclidean plane (where the scalar product is given by the familiar formula $(x, y) = x^1 y^1 + x^2 y^2$, where the angle between vectors is defined, where the trigonometric functions of angles are given in terms of coordinates by the familiar formulas of elementary analytic geometry, and so on). If $k = 1$, then

$$\|x\|^2 = (x^1)^2 - (x^2)^2 \tag{2}$$

The metric form (2) is associated with a two-dimensional Minkowski geometry.

2.  Let us investigate form (1). We begin with a consideration of k-orthogonal matrices. Incidentally, note from the very start that for $n = 2$, $k = 2$ (and in general for $k = n$) k-orthogonal matrices are simply called orthogonal matrices.

When $n = 2$ and $k = 2$, we have $E_k = E$. For this reason, in the given special case, the general condition $P E_k P^* = E_k$ for k-orthogonality of matrix $P$ assumes the form: $P P^* = E$.

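Let

$$P = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

From what has been said, this matrix is orthogonal if and only if

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

whence

$$\left.\begin{aligned} \alpha^2 + \beta^2 &= 1, & \alpha\gamma + \beta\delta &= 0, \\ \gamma\alpha + \delta\beta &= 0, & \gamma^2 + \delta^2 &= 1 \end{aligned}\right\} \tag{3}$$

Here we have three different equations. The system is simple and there is no difficulty in finding all solutions. From the second equation of (3) we can write $\gamma = -\lambda\beta$, $\delta = +\lambda\alpha$, where $\lambda$ is a new unknown. Substituting these expressions into the last equation of the system, we get

$$\gamma^2 + \delta^2 = \lambda^2(\alpha^2 + \beta^2) = \lambda^2 = 1$$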


Thus, $\lambda = \pm 1$. To determine the geometric meaning of the choice of sign, compute the determinant of matrix $P$:

$$\det P = \begin{vmatrix} \alpha & \beta \\ -\lambda\beta & \lambda\alpha \end{vmatrix} = \lambda(\alpha^2 + \beta^2) = \lambda$$

Consequently, to the values $\lambda = \pm 1$ correspond transformations of the basis with orientation preserved or reversed, respectively.
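The derivation so far can be replayed with exact numbers. The sketch assumes hypothetical rational values $\alpha = 3/5$, $\beta = 4/5$ satisfying $\alpha^2 + \beta^2 = 1$, and builds $P$ from $\gamma = -\lambda\beta$, $\delta = \lambda\alpha$ for both signs of $\lambda$.

```python
from fractions import Fraction

def orthogonal_2x2(alpha, beta, lam):
    """Build P = [[alpha, beta], [-lam*beta, lam*alpha]], as in the derivation."""
    return [[alpha, beta], [-lam * beta, lam * alpha]]

def times_transpose(p):
    """Return P P* for a 2 x 2 matrix."""
    return [[sum(p[i][m] * p[j][m] for m in range(2)) for j in range(2)]
            for i in range(2)]

def det(p):
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

# Hypothetical values with alpha^2 + beta^2 = 1.
alpha, beta = Fraction(3, 5), Fraction(4, 5)

for lam in (1, -1):
    p = orthogonal_2x2(alpha, beta, lam)
    print(lam, times_transpose(p) == [[1, 0], [0, 1]], det(p))
```

For either sign the product $P P^*$ is the unit matrix, and the determinant reproduces the chosen value of $\lambda$.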

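Now take advantage of the equation $\alpha^2 + \beta^2 = 1$. Due to this equation, $\alpha = \cos\theta$, $\beta = \sin\theta$, where $\theta$ is an arbitrary parameter. At the same time we have $\gamma = -\lambda\sin\theta$, $\delta = \lambda\cos\theta$. We have thus found all solutions of system (3) and, respectively, all orthogonal matrices (only for $n = 2$ of course). Let us confine ourselves to $\lambda = +1$. Then

$$P = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{4}$$

Formula (4) yields all orthogonal matrices for which $\det P > 0$ (that is, $\det P = +1$). It is easy to see that by themselves they constitute a group (a subgroup of the orthogonal group). This is also evident from the following two equations:

$$\begin{pmatrix} \cos\theta_1 & \sin\theta_1 \\ -\sin\theta_1 & \cos\theta_1 \end{pmatrix} \begin{pmatrix} \cos\theta_2 & \sin\theta_2 \\ -\sin\theta_2 & \cos\theta_2 \end{pmatrix} = \begin{pmatrix} \cos(\theta_1 + \theta_2) & \sin(\theta_1 + \theta_2) \\ -\sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{pmatrix},$$

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}^{-1} = \begin{pmatrix} \cos(-\theta) & \sin(-\theta) \\ -\sin(-\theta) & \cos(-\theta) \end{pmatrix}$$

They can readily be verified and they express the fact that the product of matrices like (4) and the inversion of a matrix like (4) yield matrices of the same type.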
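Both equations can be verified numerically for sample angles. The sketch works in floating point, so equality is tested up to a small tolerance; the angles and the tolerance are arbitrary choices made for illustration.

```python
import math

def rotation(theta):
    """Matrix of type (4) for the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2 x 2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))

# Product of two matrices of type (4) is of type (4) with the angles added.
print(close(matmul(rotation(0.3), rotation(1.1)), rotation(0.3 + 1.1)))

# The matrix for the angle -theta is the inverse of the matrix for theta.
print(close(matmul(rotation(0.7), rotation(-0.7)), [[1.0, 0.0], [0.0, 1.0]]))
```

The map from angles to matrices thus turns addition of angles into multiplication of matrices, which is why the matrices (4) form a group by themselves.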



3.  Let  us  take  an  orthonormal  basis  <?i,  e2  on  the  Euclidean 
plane.  Referring  to  Fig.  40  we  see  that  the  orthogonal  vectors  eit  e2 
issue  from  the  origin  and  terminate  on  the  unit  circle  (x')2  + 
+  (v2)2  =  1. 

Let  us  pass  to  a  new  basis  via  matrix  P  of  type  (4): 

e,'  —  e,cos0  sin 0,  ^ 

e2’  —  —  e,  sin0  +  e2cos0  j  ^ 

By virtue of the first of these equations, the vector $e_{1'}$ is a unit
vector and forms with the vector $e_1$ an angle $\theta$ with the ordinary
convention for the orientation of angles (i.e., taking the sign into
account, as is usual in trigonometry). The second equation can be written as

$$e_{2'} = e_1\cos\left(\theta + \frac{\pi}{2}\right) + e_2\sin\left(\theta + \frac{\pi}{2}\right)$$

From this it follows that the vector $e_{2'}$, being a unit vector, forms
with the vector $e_1$ an angle of $\theta + \frac{\pi}{2}$. Hence with the
vector $e_2$ it forms the same angle $\theta$ as the vector $e_{1'}$ does
with the vector $e_1$. In other words, the basis $e_{1'}$, $e_{2'}$ results
from a rotation of the basis $e_1$, $e_2$ through the angle $\theta$.

Thus, the orthonormal basis $e_1$, $e_2$, taken arbitrarily in a Euclidean
plane, defines, with respect to the group of matrices (4), a
class of bases that result from a rotation of the basis $e_1$, $e_2$ through
all possible angles. They are all orthonormal and identically oriented
with the basis $e_1$, $e_2$.

Remark. In order to obtain a class of bases with respect to the
entire orthogonal group by proceeding from the basis $e_1$, $e_2$, it is
necessary to make the additional construction of a class of bases
with respect to the group of matrices (4) by taking $e_1$, $-e_2$ for the
original basis.

4. To a transformation of the basis via the matrix $P$ there corresponds
a transformation of coordinates with the matrix $Q = (P^*)^{-1}$. In
this case $PP^* = E$, whence $Q = P$. Hence if the basis transforms
via formulas (5), then the components of an arbitrary vector
transform via the formulas with the same matrix:

$$\left.\begin{aligned} x^{1'} &= x^1\cos\theta + x^2\sin\theta, \\ x^{2'} &= -x^1\sin\theta + x^2\cos\theta \end{aligned}\right\} \tag{6}$$

5. We have just considered relations (6) as formulas of transformation
of the components of the given vector $x = x^1 e_1 + x^2 e_2$
under a rotation of the basis $e_1$, $e_2$ (Fig. 41a). These same formulas
may be viewed from a different standpoint. Namely, we can assume
that the basis $e_1$, $e_2$ does not change and that formulas (6) associate
with the arbitrary vector $x = x^1 e_1 + x^2 e_2$ a new vector
$x' = x^{1'} e_1 + x^{2'} e_2$. In this sense, the formulas (6) constitute a
component (coordinate) representation, relative to the basis $e_1$, $e_2$,
of a linear transformation of the Euclidean plane. Let us denote it
by $I_\theta$. By formulas (6), the vector $x' = I_\theta x$ has the same norm as
the vector $x$ and is obtained via a rotation of $x$ through the angle
$(-\theta)$ (Fig. 41b). Since the angle $\theta$ is the same for all vectors,
they all rotate in the same fashion under the linear transformation
$x' = I_\theta x$. For this reason, the linear transformation $I_\theta$ is called a
rotation of the Euclidean plane through the angle $(-\theta)$.

Fig. 41

The set of all rotations (that is, through all possible angles)
constitutes the group of rotations of the Euclidean plane. It is
isomorphic to the group of matrices of type (4), which for this reason
is also called a rotation group.

6.  A  transformation  that  preserves  the  metric  of  the  space  is 
said  to  be  isometric. 

We  now  confine  ourselves  to  some  examples  and  leave  a  closer 
study  of  isometric  transformations  to  the  next  chapter. 

7. Under any rotation $x' = I_\theta x$ the metric properties of the
images coincide with the metric properties of the inverse images
(the norm of an image is equal to the norm of the inverse image:
$\| x' \| = \| x \|$; a scalar product of images is equal to the scalar
product of the inverse images: $(x', y') = (x, y)$). Therefore every
rotation $I_\theta$ is an isometric transformation. Also included in
isometric transformations of the Euclidean plane are reflections about
a straight line; for example, the transformation $x^{1'} = x^1$, $x^{2'} = -x^2$.
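As an illustration of the isometry property, here is a minimal Python sketch that applies formulas (6) and the reflection $x^{1'} = x^1$, $x^{2'} = -x^2$ to sample vectors and compares scalar products before and after. The function names and the sample values are assumptions made for this example only.

```python
import math

def rotate_components(x, theta):
    """Formulas (6): x1' = x1 cos t + x2 sin t, x2' = -x1 sin t + x2 cos t."""
    c, s = math.cos(theta), math.sin(theta)
    return (x[0] * c + x[1] * s, -x[0] * s + x[1] * c)

def reflect(x):
    """Reflection about the first coordinate axis: x1' = x1, x2' = -x2."""
    return (x[0], -x[1])

def dot(x, y):
    """Euclidean scalar product in an orthonormal basis."""
    return x[0] * y[0] + x[1] * y[1]

theta = 0.4
x, y = (2.0, 1.0), (-1.0, 3.0)          # arbitrary sample vectors

xr, yr = rotate_components(x, theta), rotate_components(y, theta)
rotation_keeps_dot = math.isclose(dot(xr, yr), dot(x, y))
reflection_keeps_dot = math.isclose(dot(reflect(x), reflect(y)), dot(x, y))

# In the active reading, the polar angle of x changes by -theta.
angle_change = math.atan2(xr[1], xr[0]) - math.atan2(x[1], x[0])
```

The last line reflects the remark of Subsection 5: read as a transformation of the plane with the basis held fixed, formulas (6) turn every vector through $(-\theta)$.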

Proof is given below (see Sections 7, 8, Chapter IX) that an
arbitrary isometric transformation on a two-dimensional Euclidean
plane is determined relative to an orthonormal basis by a component
representation with an arbitrary orthogonal matrix (with
determinant of any sign). It is either a rotation of the plane through
an angle, or a reflection, or the composition of a reflection followed
by a rotation.

Remark. We disregard parallel translations of the Euclidean
plane that displace the origin of coordinates because for the
present we view the Euclidean plane as a vector space.

8. Considered in the geometry of the Euclidean plane are the
invariants of the orthogonal group. Here the fact of invariance is of
a purely algebraic nature. For example, the invariance of the
norms of vectors means the identity
$(x^{1'})^2 + (x^{2'})^2 = (x^1)^2 + (x^2)^2$ as a consequence of
formulas (6) or the formulas $x^{1'} = x^1$, $x^{2'} = -x^2$.
From the geometrical standpoint, two views are possible.
If the orthogonal group is regarded as a group generating the
class of orthonormal bases, then invariance under this group
signifies equivalence of such bases. If the orthogonal group is
viewed as a group generating isometric linear transformations,
then invariance under this group signifies preservation of the
metric properties of the figures (systems of vectors) under
rotations and reflections.

9. We have already pointed out that in one and the same linear
space it is possible to introduce a metric in different ways by
taking distinct bilinear forms for the scalar product. Let us illustrate
this fact using the Euclidean plane.

Elementary geometry states that in the Euclidean plane it is
possible to compare lengths and to measure angles. Let a scale
unit be chosen and the unit vectors $a_1$, $a_2$ (orthogonal from the
elementary viewpoint) be taken for a basis. Then we can introduce
the scalar product $(x, y)$ and put

$$(x, y) = x^1 y^1 + x^2 y^2 \tag{7}$$

where $x = x^1 a_1 + x^2 a_2$, $y = y^1 a_1 + y^2 a_2$. By Sections 1, 2 we can
define the lengths of all vectors and the concept of orthogonality,
and, by a familiar formula of elementary analytic geometry, it
is possible to express the angle between two vectors in terms of
their lengths and the scalar product. Then the lengths and angles
determined via the scalar product (7), which is given relative to
the basis $a_1$, $a_2$, will coincide with the lengths and angles
determined in elementary plane geometry.

Now, on this same plane, we will examine a geometry that is
introduced artificially in addition to the natural geometry. To do
so, we take some nonorthonormal basis $e_1$, $e_2$ in addition to the
basis $a_1$, $a_2$. In Fig. 42 we have, for convenience, taken $e_2$ as a
unit vector and $e_1$ as a vector exceeding unity in length and
orthogonal to the vector $e_2$. Proceeding from this basis, we construct a
class of bases with respect to the group of matrices (4), that is to
say, by formulas (5). In the plane we now introduce a new quadratic
metric by determining the scalar product by the same formula (7),
but this time we will assume that $x^1$, $x^2$, $y^1$, $y^2$ are the
components of the vectors $x$, $y$ relative to the basis $e_1$, $e_2$ shown in
Fig. 42. In the new metric, the vectors $e_1$, $e_2$ are orthogonal and
have unit lengths; the basis $e_{1'}$, $e_{2'}$ defined by formulas (5) is also
orthonormal for any value of $\theta$. In short, in the new metric
everything up to this point is repeated. But from the standpoint of
the old metric, everything is depicted in a distorted manner. For
example, a unit circle which in the basis $e_1$, $e_2$ of Fig. 42 is given
by the equation $(x^1)^2 + (x^2)^2 = 1$ is an ellipse in the sense of the
old metric. An arbitrary orthonormal basis defined by formulas (5)
is composed of the vectors $e_{1'}$, $e_{2'}$, which in the old metric are,
in general, not orthogonal and lie on two conjugate diameters of the
ellipse $(x^1)^2 + (x^2)^2 = 1$.

To see the truth of these remarks, it suffices to
establish a metric isomorphism between the Euclidean plane with
its original metric and the same plane with its new metric. By
Section 5 (see the proof of Theorem 1), we obtain a metric
isomorphism if we establish a linear isomorphism in which the bases
depicted in Figs. 40 and 42 correspond to each other. For the
sake of clarity, we assume that the Euclidean plane is in the
form of two copies, $P_1$ and $P_2$, corresponding to Fig. 40 and
Fig. 42. Situate $P_1$ and $P_2$ in three-dimensional Euclidean space as
they are shown in Fig. 43. That is, bring to coincidence the vectors
denoted by $e_2$ in planes $P_1$ and $P_2$ (this can be done since we took
them to be unit vectors); then rotate $P_2$ about $e_2$, bringing it to a
position in which the endpoints of the vectors denoted by $e_1$ lie on
a single perpendicular to the plane $P_1$. Now, with each vector $x$
of $P_1$ we associate a vector $x'$ of $P_2$ that projects orthogonally on
$P_1$ into the vector $x$. This correspondence is clearly a linear
isomorphism; at the same time it is a metric isomorphism since to the
orthonormal basis of plane $P_1$ corresponds a basis of plane $P_2$
that is orthonormal in its new metric.

Fig. 44

From our construction we immediately perceive the truth of the
foregoing remarks. Namely, that the unit circle in the new metric
of plane $P_2$ is an ellipse in the old metric; that the bases which are
orthonormal in the new metric are composed of vectors that lie
along conjugate diameters of the ellipse. For the new metric
properties of vectors in $P_2$ we take the properties of their inverse
images in $P_1$ (namely, for the norm of a vector in $P_2$ we take the
norm of its inverse image in $P_1$; for the scalar product of two
vectors in $P_2$ we take the scalar product of their inverse images in
$P_1$). Note, in particular, that the linear transformation $I_\theta$, which, in
the basis $e_1$, $e_2$ of $P_2$, has the component representation (6), is a
rotation of the plane $P_2$ in the sense of the new metric, whereas in
the original metric this transformation is the so-called elliptic
rotation of the Euclidean plane. The name is related to the fact that
if the parameter $\theta$ changes, then the image $x' = I_\theta x$ of a fixed
vector $x$ describes with its endpoint an ellipse passing through the
end of vector $x$. To distinct vectors $x$, $y$, $z$ correspond ellipses that
are similar and similarly situated (see Fig. 44; all this is easy to
grasp if we revert to Fig. 43). Fig. 44 depicts two hatched figures,
one of which is carried into the other by an elliptic rotation. In
the geometry that we artificially introduced in the plane, these two
figures are to be regarded as identical (congruent), that is to say,
superimposable.
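The elliptic rotation can be traced numerically. The sketch below is only an illustration under stated assumptions: the basis vector $e_1$ is taken along the first Cartesian axis with a hypothetical Euclidean length `A = 2`, the vector $e_2$ is the unit vector of the second axis, and formulas (6) are applied to components relative to this basis.

```python
import math

A = 2.0  # hypothetical Euclidean length of e1; e2 is a unit vector orthogonal to e1

def elliptic_rotation(p, theta):
    """Apply formulas (6) to the components of p relative to e1 = (A, 0), e2 = (0, 1)."""
    x1, x2 = p[0] / A, p[1]                 # components in the basis e1, e2
    c, s = math.cos(theta), math.sin(theta)
    y1, y2 = x1 * c + x2 * s, -x1 * s + x2 * c
    return (A * y1, y2)                     # back to ordinary Cartesian coordinates

p = (1.0, 1.5)                              # arbitrary sample point
level = (p[0] / A) ** 2 + p[1] ** 2         # squared norm of p in the new metric

orbit = [elliptic_rotation(p, 0.05 * k) for k in range(126)]
max_dev = max(abs((q[0] / A) ** 2 + q[1] ** 2 - level) for q in orbit)
euclid_lengths = [math.hypot(q[0], q[1]) for q in orbit]
```

Every image keeps its norm in the new metric, so the orbit stays on the curve $(X/A)^2 + Y^2 = \text{constant}$, which is an ellipse in the old metric; the Euclidean lengths along the orbit visibly vary.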

10. The basis $e_1$, $e_2$ could have been taken quite arbitrarily and,
assuming that $x = x^1 e_1 + x^2 e_2$, $y = y^1 e_1 + y^2 e_2$, we could introduce
the scalar product by the formula

$$(x, y) = g(x, y) = \sum_{i,j=1}^{2} g_{ij} x^i y^j \tag{8}$$

where the bilinear form $g(x, y)$ is chosen at pleasure, so long as
the quadratic form $g(x, x)$ is positive definite. From Section 5 it
follows that we will obtain a two-dimensional space metrically
isomorphic to the Euclidean plane. Taking advantage of the positive
definiteness of $g(x, x)$, it is easy to prove that circles, that is, the
curves

$$\| x \|^2 = g(x, x) = \text{constant}$$

on the plane with the scalar product (8) are, from the elementary
point of view, ellipses.

We  thus  see  that  on  one  and  the  same  plane  it  is  possible  to 
specify  an  infinite  number  of  distinct  Euclidean  metrics.  To  picture 
this  more  vividly,  notice  that  any  ellipse  with  centre  at  the  zero 
point  is  a  unit  circle  in  some  one  (quite  definite)  Euclidean  metric. 
Hence,  there  are  as  many  Euclidean  metrics  on  the  plane  as  there 
are  distinct  ellipses  with  common  centre. 
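A short numerical sketch may help here. Assuming a hypothetical positive definite symmetric matrix `G` for the form (8), we build a basis that is orthonormal with respect to $g$ by the Gram-Schmidt process and sample the corresponding unit circle; the matrix entries and all function names are illustrative assumptions.

```python
import math

def g(x, y, G):
    """Bilinear form (8): sum of g_ij x^i y^j."""
    return sum(G[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def g_orthonormalize(e1, e2, G):
    """Gram-Schmidt with respect to g; relies on g(x, x) > 0 for x != 0."""
    n1 = math.sqrt(g(e1, e1, G))
    u1 = (e1[0] / n1, e1[1] / n1)
    p = g(e2, u1, G)
    w = (e2[0] - p * u1[0], e2[1] - p * u1[1])
    n2 = math.sqrt(g(w, w, G))
    return u1, (w[0] / n2, w[1] / n2)

G = [[2.0, 1.0], [1.0, 3.0]]    # hypothetical positive definite symmetric matrix
u1, u2 = g_orthonormalize((1.0, 0.0), (0.0, 1.0), G)

# Points cos t * u1 + sin t * u2 of the unit circle of the metric g.
pts = [(math.cos(t) * u1[0] + math.sin(t) * u2[0],
        math.cos(t) * u1[1] + math.sin(t) * u2[1])
       for t in (0.1 * k for k in range(63))]
max_dev = max(abs(g(p, p, G) - 1.0) for p in pts)
euclid_lengths = [math.hypot(p[0], p[1]) for p in pts]
```

All sampled points satisfy $g(x, x) = 1$, while their elementary (Euclidean) distances from the origin differ, as one expects of an ellipse that is not a circle.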

11. Quite naturally, in linear spaces of greater dimensionality
it is also possible to introduce different metrics that are isomorphic
to one another. Thus, for instance, in the space of functions
continuous on the interval $[-1, 1]$ we can introduce the scalar
product

$$(x, y) = \int_{-1}^{1} \varphi(t)\, x(t)\, y(t)\, dt \tag{9}$$

where $\varphi(t)$ is an arbitrarily chosen positive continuous function.
Then in place of formula (13), Section 4, we have

$$\| x(t) \|^2 = \int_{-1}^{1} \varphi(t)\, x^2(t)\, dt$$

The resulting space is metrically isomorphic to the space of
continuous functions with scalar product (12), Section 4, specified on
$[-1, 1]$. The metric isomorphism between them is, for instance, the
map carrying $x(t)$ into $x(t)\sqrt{\varphi(t)}$.



Note in passing that the pairs of functions for which the scalar
product (9) vanishes are called orthogonal with weight $\varphi(t)$ on the
interval $[-1, 1]$.
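The metric isomorphism $x(t) \to x(t)\sqrt{\varphi(t)}$ can be illustrated numerically. In the sketch below the weight $\varphi(t) = 1 + t^2$ and the functions `x`, `y` are hypothetical choices, the integrals are approximated by the composite Simpson rule, and the unweighted product is assumed to be the plain integral of $x(t)\,y(t)$ over $[-1, 1]$.

```python
import math

def integrate(f, a=-1.0, b=1.0, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def phi(t):
    return 1.0 + t * t          # hypothetical positive continuous weight

def x(t):
    return 1.0 + t              # sample continuous functions on [-1, 1]

def y(t):
    return math.cos(t)

# Scalar product (9) with weight phi.
weighted = integrate(lambda t: phi(t) * x(t) * y(t))

# Unweighted product of the images under x(t) -> x(t) * sqrt(phi(t)).
mapped = integrate(lambda t: (x(t) * math.sqrt(phi(t))) * (y(t) * math.sqrt(phi(t))))

# Squared norm of x in the weighted metric; the exact value is 56/15.
norm_sq = integrate(lambda t: phi(t) * x(t) ** 2)
```

The two scalar products agree because the factor $\sqrt{\varphi(t)}$ appears twice under the integral sign, which is all that the isomorphism requires.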

§  8.  The  group  of  hyperbolic  rotations 

1. We now turn to the (two-dimensional) geometry of Minkowski.
We will carry out all constructions in the ordinary Euclidean
plane. In it we take an orthonormal basis $e_1$, $e_2$ and introduce
the Minkowski metric with the aid of the quadratic form

$$\| x \|^2 = (x^1)^2 - (x^2)^2 \tag{1}$$

relative to the basis $e_1$, $e_2$. Accordingly we have a formula for the
scalar product

$$(x, y) = x^1 y^1 - x^2 y^2 \tag{2}$$

In this metric, $\| e_1 \|^2 = 1$, $\| e_2 \|^2 = -1$, $(e_1, e_2) = 0$. Thus, the
basis $e_1$, $e_2$ is also orthonormal in the metric (1); here, $e_1$ is a unit
vector and $e_2$ is an imaginary-unit vector.

In order to get a feeling for the peculiarity of the metric (1), it
is best to begin with a consideration of the unit circle. This is the
term used to describe the locus of endpoints of all possible vectors
whose norms are equal to unity in absolute value (we assume that
all vectors issue from the zero point). In the given basis, the unit
circle is defined by the equation $|(x^1)^2 - (x^2)^2| = 1$, whence either
$(x^1)^2 - (x^2)^2 = 1$ or $(x^1)^2 - (x^2)^2 = -1$. In Euclidean geometry,
these two equations define, relative to the basis $e_1$, $e_2$, conjugate
equilateral hyperbolas whose common asymptotes are the bisectors
of the quadrantal angles. Thus, in the Minkowski metric, the unit
circle consists of two Euclidean hyperbolas, on one of which lie the
endpoints of the unit vectors and on the other the endpoints of the
imaginary-unit vectors (Fig. 45). In contrast to this terminology,
some use the term unit circle to mean only the first of the
hyperbolas, the other being called the imaginary-unit circle.

Consider an arbitrary vector $x = x^1 e_1 + x^2 e_2$ extending along
one of the asymptotes of these hyperbolas; in this case $|x^1| = |x^2|$
and therefore $\| x \|^2 = 0$. Thus, on the asymptotes lie isotropic
vectors, that is, vectors with zero norm.

2. Important remark. The triangle inequality does not hold true
in the Minkowski plane. This is immediately evident in the case of
triangle $OAB$ (Fig. 46), where the vectors $\overrightarrow{OA}$ and $\overrightarrow{AB}$ are
isotropic (parallel to the asymptotes of the hyperbolas). If we denote
$\overrightarrow{OA} = x$, $\overrightarrow{AB} = y$, then

$$\| x + y \| > \| x \| + \| y \| = 0$$

or, in terms of distances,

$$\rho(O, B) > \rho(O, A) + \rho(A, B) = 0$$

It  can  be  demonstrated  that  in  any  quasi-Euclidean  space  there 
are  three  points  for  which  the  triangle  axiom  does  not  hold  true. 
We  leave  the  proof  to  the  reader. 
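The failure of the triangle inequality is easy to reproduce with concrete numbers. The sketch below picks two isotropic vectors along the two asymptotes (the particular vectors are our own choice) and compares the Minkowski norms; the norm is computed as a square root only because all the squares involved here are nonnegative.

```python
import math

def mink_sq(x):
    """Quadratic form (1): (x1)^2 - (x2)^2."""
    return x[0] ** 2 - x[1] ** 2

def mink_norm(x):
    """Square root of the form; valid here since the squares are nonnegative."""
    return math.sqrt(mink_sq(x))

x = (1.0, 1.0)     # isotropic vector along the bisector x2 = x1
y = (1.0, -1.0)    # isotropic vector along the bisector x2 = -x1
s = (x[0] + y[0], x[1] + y[1])

lhs = mink_norm(s)                    # norm of x + y
rhs = mink_norm(x) + mink_norm(y)     # sum of the norms
```

Here `lhs` equals 2 while `rhs` equals 0, so the "third side" is strictly longer than the sum of the other two.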

3. Let $x$, $y$ be two nonisotropic vectors. Suppose they are
perpendicular to one another in the sense of Minkowski; we shall
attempt to describe what their perpendicularity means from the
Euclidean standpoint. From (2) we have $(x, y) = x^1 y^1 - x^2 y^2 = 0$; then

$$\left((x^1)^2 - (x^2)^2\right)\left((y^1)^2 - (y^2)^2\right) = -\left(x^2 y^1 - x^1 y^2\right)^2,$$

whence it follows that $\| x \|^2 = (x^1)^2 - (x^2)^2$ and
$\| y \|^2 = (y^1)^2 - (y^2)^2$ are numbers
of distinct signs. Hence, if one of the vectors $x$, $y$ has a real norm
in the Minkowski metric, then the other vector has an imaginary
norm; in the Euclidean sense, this means that the vectors $x$, $y$ or
their extensions intersect different hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$.
Since we are now interested only in the directions of the vectors
$x$, $y$, we can, without any loss of generality, assume that their
endpoints lie on the unit circle of the Minkowski metric. For example,
let $(x^1)^2 - (x^2)^2 = 1$, $(y^1)^2 - (y^2)^2 = -1$. Then if
$(x, y) = x^1 y^1 - x^2 y^2 = 0$, it will follow that
$(y^1 - x^1)^2 - (y^2 - x^2)^2 = 0$;
conversely, from the latter relation it follows that $(x, y) = 0$. But
the equation $(y^1 - x^1)^2 - (y^2 - x^2)^2 = 0$ means that the difference
$y - x$ of the vectors $x$, $y$ is an isotropic vector, which means that
it is directed along some quadrantal bisector. But this is equivalent
to the vectors $x$, $y$ being symmetric with respect to the other
bisector.

Fig. 45    Fig. 46




To summarize, then, the vectors $x$, $y$ are orthogonal to each other
in the sense of Minkowski if and only if in the Euclidean sense
they are located on rays that are symmetric relative to one of the
quadrantal bisectors (Fig. 47).

Observe that the isotropic vector $x$ (lying on the bisector) is
orthogonal to itself: $(x, x) = \| x \|^2 = 0$.
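The criterion can be tried on a concrete pair. In the following sketch a unit vector on the hyperbola $(x^1)^2 - (x^2)^2 = 1$ is mirrored in the bisector $x^2 = x^1$; the parameter value and the helper names are assumptions made for illustration.

```python
import math

def mdot(x, y):
    """Minkowski scalar product (2): x1*y1 - x2*y2."""
    return x[0] * y[0] - x[1] * y[1]

def mirror(x):
    """Euclidean reflection in the bisector x2 = x1 (swaps the components)."""
    return (x[1], x[0])

a = 0.8                                   # arbitrary parameter
x = (math.cosh(a), math.sinh(a))          # endpoint on (x1)^2 - (x2)^2 = 1
y = mirror(x)                             # endpoint on (x1)^2 - (x2)^2 = -1
d = (y[0] - x[0], y[1] - x[1])            # difference y - x

orthogonal = abs(mdot(x, y)) < 1e-12
difference_isotropic = abs(mdot(d, d)) < 1e-12
```

The mirror image lands on the conjugate hyperbola, the two vectors are orthogonal in the sense of Minkowski, and their difference is isotropic, in agreement with the argument of Subsection 3.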

4. The material presented in Subsection 3 permits giving a
Euclidean description of all bases orthonormal in the Minkowski
metric under consideration. Namely, the arbitrary orthonormal
basis is composed of the vectors $e_{1'}$, $e_{2'}$, the endpoints of which lie
on the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ symmetrically relative to
one of their (common) asymptotes (Fig. 48).


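5. Now consider $k$-orthogonal matrices ($n = 2$, $k = 1$)
corresponding to a two-dimensional Minkowski metric. Write down any
one of the matrices in the form

$$P = \begin{Vmatrix} \alpha & \beta \\ \gamma & \delta \end{Vmatrix}$$

By the definition of $k$-orthogonality we have in this case

$$\begin{Vmatrix} \alpha & \beta \\ \gamma & \delta \end{Vmatrix}
\begin{Vmatrix} 1 & 0 \\ 0 & -1 \end{Vmatrix}
\begin{Vmatrix} \alpha & \gamma \\ \beta & \delta \end{Vmatrix}
= \begin{Vmatrix} 1 & 0 \\ 0 & -1 \end{Vmatrix}$$

whence

$$\left.\begin{aligned} \alpha^2 - \beta^2 &= 1, & \alpha\gamma - \beta\delta &= 0, \\ \gamma\alpha - \delta\beta &= 0, & \gamma^2 - \delta^2 &= -1 \end{aligned}\right\} \tag{3}$$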


We find the general solution of this system (three equations).
Because of the second equation of (3) we can write

$$\gamma = \lambda\beta, \quad \delta = \lambda\alpha \tag{4}$$



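where $\lambda$ is the new unknown. Substituting these expressions into
the last equation, we get

$$\gamma^2 - \delta^2 = \lambda^2(\beta^2 - \alpha^2) = -\lambda^2 = -1$$

Thus, $\lambda = \pm 1$. On the other hand,

$$\det P = \begin{vmatrix} \alpha & \beta \\ \lambda\beta & \lambda\alpha \end{vmatrix} = \lambda(\alpha^2 - \beta^2) = \lambda$$

Hence, to the values $\lambda = \pm 1$ there correspond transformations
of the basis with orientation preserved or reversed, respectively.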

Now make use of the equation $\alpha^2 - \beta^2 = 1$. Its general solution
is of the form

$$\alpha = \pm\cosh\theta, \quad \beta = \pm\sinh\theta \tag{5}$$

where $\theta$ is an arbitrary parameter. However, we can assume that
if $\alpha > 0$, then

$$\alpha = \cosh\theta, \quad \beta = \sinh\theta, \quad -\infty < \theta < +\infty \tag{5a}$$

and if $\alpha < 0$, then

$$\alpha = -\cosh\theta, \quad \beta = -\sinh\theta, \quad -\infty < \theta < +\infty \tag{5b}$$

since the remaining cases of (5) reduce to (5a) or (5b) via a
replacement of $\theta$ by $-\theta$.

Noting that $\lambda = \pm 1$, we see that formulas (4), (5a) and (5b)
yield all the solutions of (3). We have thus found all the $k$-orthogonal
matrices for $n = 2$, $k = 1$.
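The solution just obtained can be checked against the defining relation. The sketch below builds the matrix from (4), (5a), (5b) for all four sign choices and measures how far $P J P^*$ is from $J$, where $J$ denotes the diagonal matrix of the form (1); the symbol $J$, the sample value of $\theta$ and the function names are our own notation.

```python
import math

J = [[1.0, 0.0], [0.0, -1.0]]   # matrix of the form (x1)^2 - (x2)^2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def k_orthogonal(theta, lam, sign):
    """Matrix from (4), (5a), (5b): sign=+1 gives (5a), sign=-1 gives (5b)."""
    alpha, beta = sign * math.cosh(theta), sign * math.sinh(theta)
    return [[alpha, beta], [lam * beta, lam * alpha]]

def residual(p):
    """Largest entry of P J P^T - J in absolute value."""
    m = matmul(matmul(p, J), transpose(p))
    return max(abs(m[i][j] - J[i][j]) for i in range(2) for j in range(2))

cases = [(lam, sign) for lam in (1, -1) for sign in (1, -1)]
matrices = [k_orthogonal(0.6, lam, sign) for lam, sign in cases]
residuals = [residual(p) for p in matrices]
dets = [p[0][0] * p[1][1] - p[0][1] * p[1][0] for p in matrices]
```

For each of the four families the residual vanishes up to rounding, and the determinant equals $\lambda$, as derived above.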

By subjecting the original basis $e_1$, $e_2$ to a transformation with
an arbitrary $k$-orthogonal matrix, we obtain a new basis $e_{1'}$, $e_{2'}$:

$$\left.\begin{aligned} e_{1'} &= \alpha e_1 + \beta e_2, \\ e_{2'} &= \gamma e_1 + \delta e_2 \end{aligned}\right\} \tag{6}$$

whose vector $e_{1'}$ is a unit vector:

$$\| e_{1'} \|^2 = \alpha^2 - \beta^2 = +1$$

and the vector $e_{2'}$ is an imaginary-unit vector:

$$\| e_{2'} \|^2 = \gamma^2 - \delta^2 = -1$$

from which it is evident that the transformation (6) cannot carry
a vector whose endpoint lies on one of the hyperbolas
$(x^1)^2 - (x^2)^2 = \pm 1$ into a vector with endpoint on the other
hyperbola.

6. We now elucidate the Euclidean geometric meaning of the
conditions $\alpha > 0$ and $\alpha < 0$. Since the basis $e_1$, $e_2$ is orthonormal
in the Euclidean metric of the plane, it follows that, taking
Euclidean scalar products, we get

$$(e_{1'}, e_1) = \alpha(e_1, e_1) + \beta(e_2, e_1) = \alpha \tag{7}$$

Thus, for $\alpha > 0$ the vectors $e_{1'}$ and $e_1$ constitute an acute angle,
for $\alpha < 0$, an obtuse angle. We then conclude that if $\alpha > 0$, then
the vector $e_{1'}$ has its endpoint on the same branch of the hyperbola
$(x^1)^2 - (x^2)^2 = 1$ as the vector $e_1$ (the right branch in Fig. 48); if
$\alpha < 0$, then the endpoints of the vectors $e_1$, $e_{1'}$ lie on different
branches of this hyperbola.

7. As in (7) we can express $\beta$, $\gamma$, $\delta$ in terms of the Euclidean
scalar products of the vectors:

$$\beta = (e_{1'}, e_2), \quad \gamma = (e_{2'}, e_1), \quad \delta = (e_{2'}, e_2)$$

Figs. 49 and 50 show different arrangements of the basis $e_{1'}$, $e_{2'}$
depending on the signs of $\lambda$, $\alpha$, and $\beta$.

8. Now consider the matrix $P$ for $\lambda = +1$, $\alpha > 0$. By the
foregoing,

$$P = \begin{Vmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{Vmatrix} \tag{8}$$

In making the transition from an orthonormal (in the Minkowski
metric) basis $e_1$, $e_2$ to a new basis via a matrix $P$ of the form (8) we
obtain an orthonormal basis $e_{1'}$, $e_{2'}$ with the same orientation as
$e_1$, $e_2$; besides, the terminal points of the vectors $e_{1'}$, $e_{2'}$ lie on the
same branches of the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ as do the
terminal points of the corresponding vectors $e_1$, $e_2$. Matrices of the
type (8) play the same role in two-dimensional Minkowski
geometry as the matrices (4), Section 7, do in two-dimensional
Euclidean geometry.

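9. Matrices of type (8) constitute a subgroup of the whole
$k$-orthogonal group for $k = 1$, $n = 2$. Indeed, we have the equations

$$\begin{Vmatrix} \cosh\theta_1 & \sinh\theta_1 \\ \sinh\theta_1 & \cosh\theta_1 \end{Vmatrix}
\begin{Vmatrix} \cosh\theta_2 & \sinh\theta_2 \\ \sinh\theta_2 & \cosh\theta_2 \end{Vmatrix}
= \begin{Vmatrix} \cosh(\theta_1 + \theta_2) & \sinh(\theta_1 + \theta_2) \\ \sinh(\theta_1 + \theta_2) & \cosh(\theta_1 + \theta_2) \end{Vmatrix},$$

$$\begin{Vmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{Vmatrix}^{-1}
= \begin{Vmatrix} \cosh(-\theta) & \sinh(-\theta) \\ \sinh(-\theta) & \cosh(-\theta) \end{Vmatrix}$$

These equations are readily established and they signify that the
product of matrices of type (8) and the inversion of a matrix of
type (8) lead to matrices of the same type.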
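As with ordinary rotations, the two equations can be verified numerically for sample parameters. The following Python sketch uses helper names of our own (`hyp`, `matmul`, `max_diff`) and arbitrary values of the parameters.

```python
import math

def hyp(theta):
    """Matrix of type (8): rows (cosh t, sinh t) and (sinh t, cosh t)."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(a, b):
    """Largest entrywise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

t1, t2 = 0.4, -1.1                       # arbitrary sample parameters
identity = [[1.0, 0.0], [0.0, 1.0]]

product_gap = max_diff(matmul(hyp(t1), hyp(t2)), hyp(t1 + t2))
inverse_gap = max_diff(matmul(hyp(t1), hyp(-t1)), identity)
```

Both gaps are zero up to rounding, which reflects the addition formulas for the hyperbolic sine and cosine.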




10. To a transformation of the basis via the matrix $P$ corresponds
a transformation of components with the matrix $Q = (P^*)^{-1}$:

$$\left.\begin{aligned} x^{1'} &= x^1\cosh\theta - x^2\sinh\theta, \\ x^{2'} &= -x^1\sinh\theta + x^2\cosh\theta \end{aligned}\right\} \tag{9}$$

On the other hand, as in Subsection 5, Section 7, we can consider
that the basis $e_1$, $e_2$ does not change but that the formulas (9)
associate to an arbitrary vector $x = x^1 e_1 + x^2 e_2$ a new vector
$x' = x^{1'} e_1 + x^{2'} e_2$. In this sense, formulas (9) constitute a
component representation relative to the basis $e_1$, $e_2$ of some linear
transformation of the plane. We denote it by $H_\theta$. Relative to the
Minkowski metric, this is an isometric transformation similar to the
rotation $x' = I_\theta x$ of the plane with a Euclidean metric. For this
reason, the transformation $x' = H_\theta x$ is called a hyperbolic rotation
of the plane. The term “hyperbolic” is due to the fact that for a
fixed $x$ and varying $\theta$ the terminal point of the vector $x' = H_\theta x$
slides along the hyperbola $(x^1)^2 - (x^2)^2 = \| x \|^2$ (Fig. 51).

Fig. 51

Let an arbitrary geometrical figure $W$ be given in the plane. The
hyperbolic rotation $H_\theta$ carries it into a new figure $W'$. It is taken,
by definition, that the figures $W$ and $W'$ are congruent in the
Minkowski metric. In the Euclidean metric they are, generally
speaking, not congruent. An elementary example is shown in Fig. 52,
where the figure and its image are designated by $W$ and $W'$ and
are shown hatched.

11. We will now see that the analogy between a hyperbolic and
an ordinary (Euclidean) rotation is very far-reaching. With this
purpose in mind, we now decipher the geometric significance of the
parameter $\theta$ in the hyperbolic rotation $x' = H_\theta x$. For the sake of
simplicity, suppose that $x = \{x^1, x^2\}$ is a unit vector, that is, its
terminus lies on the hyperbola $(x^1)^2 - (x^2)^2 = 1$. Let $S(\theta)$ be the
area of the curvilinear triangle bounded by the vectors $x$, $x' = H_\theta x$
and by the arc of the hyperbola between their termini (Fig. 53).



We will say that $S > 0$ if the rotation from $x$ to $x'$ is
counterclockwise, and $S < 0$ otherwise. Let $x'' = H_{\Delta\theta} x'$, let $\Delta S$ be the
increment in the area $S(\theta)$ when passing from $x'$ to $x''$, and let
$\Delta\sigma$ be the oriented area of the parallelogram constructed on the
vectors $x'$ and $x''$ (Fig. 53). Then

$$\Delta\sigma = \begin{vmatrix} x^{1'} & x^{2'} \\ x^{1''} & x^{2''} \end{vmatrix}
= -(x^{1'})^2\sinh\Delta\theta + (x^{2'})^2\sinh\Delta\theta = -\sinh\Delta\theta$$

Here we make use of formulas (9) with $\theta$ replaced by $\Delta\theta$ and the
equation of the hyperbola $(x^{1'})^2 - (x^{2'})^2 = 1$. On the other hand,

$$\Delta S \approx \frac{1}{2}\Delta\sigma = -\frac{1}{2}\sinh\Delta\theta \approx -\frac{1}{2}\Delta\theta$$

where the approximate equations hold up to quantities of higher
order relative to $\Delta\theta$, whence

$$dS = -\frac{1}{2}\,d\theta$$

Thus, if we take into account that $S = 0$ for $\theta = 0$, then we get
the equation

$$\theta = -2S$$

Fig. 52    Fig. 53
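The relation $\theta = -2S$ can be confirmed by a direct numerical computation. The sketch below approximates the oriented area swept out by the vector as a sum of thin triangles with a vertex at the origin; the starting vector, the value of $\theta$, and the number of subdivisions `n` are illustrative assumptions.

```python
import math

def hyp_rotate(x, theta):
    """Formulas (9): components of the image under the hyperbolic rotation."""
    c, s = math.cosh(theta), math.sinh(theta)
    return (x[0] * c - x[1] * s, -x[0] * s + x[1] * c)

def swept_area(x, theta, n=20000):
    """Oriented area between x, H_theta x and the arc, as a sum of thin triangles.

    Each triangle has vertices at the origin and at two neighbouring points
    of the arc; counterclockwise triangles contribute positive area.
    """
    total = 0.0
    prev = x
    for k in range(1, n + 1):
        cur = hyp_rotate(x, theta * k / n)
        total += 0.5 * (prev[0] * cur[1] - prev[1] * cur[0])
        prev = cur
    return total

theta = 1.3
x = (math.cosh(0.5), math.sinh(0.5))   # unit vector: (x1)^2 - (x2)^2 = 1
S = swept_area(x, theta)
```

For positive $\theta$ the computed area is negative (the vector turns clockwise) and $-2S$ reproduces $\theta$ up to the discretization error of the sum.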

Note that when an ordinary (Euclidean) rotation of the plane takes
place through the angle $\theta$, then the terminus of every unit vector
slides along the unit circle and the angle $\theta$ is equal in absolute
value to twice the area swept out by the turning vector. As can be
seen from the calculations just made, a similar situation arises in
a hyperbolic rotation. The terminus of the unit (or imaginary-unit)
vector slides along one of the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ and
sweeps out an area equal in absolute value to one half the
“hyperbolic angle”.

12. It is immediately apparent from (9) that the hyperbolic
rotation has eigenvectors directed along the (common) asymptotes of
the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$. Indeed, if the vector
$x = \{x^1, x^2\}$ lies on the first asymptote, it follows that $x^1 = x^2$ and
then $x^{1'} = x^{2'}$. Thus, for the first asymptote, $x' = H_\theta x = \lambda_1 x$,
where, as is readily seen,

$$\lambda_1 = \cosh\theta - \sinh\theta$$

Similarly, for the second asymptote $x = \{x^1, -x^1\}$ and
$x' = H_\theta x = \lambda_2 x$, where

$$\lambda_2 = \cosh\theta + \sinh\theta$$

An essential point is that

$$\lambda_1\lambda_2 = 1 \tag{10}$$

For $\theta > 0$ we have $\lambda_1 < 1$, $\lambda_2 > 1$; in this case the plane is
compressed toward the straight line $x^2 = -x^1$ with coefficient $\lambda_1$
and, due to (10), is stretched by the reciprocal factor
$\lambda_2 = 1/\lambda_1$ in the orthogonal direction, away from
the straight line $x^2 = x^1$, as shown in Fig. 54 for a number of
points. Here, all the points of the plane not lying on the invariant
straight lines $x^2 = \pm x^1$ slide along the hyperbolas
$(x^1)^2 - (x^2)^2 = \text{constant}$, which is clearly evident (because of (10))
irrespective of the arguments of the preceding subsections.

Fig. 54



If $\theta < 0$, then the directions of stretching and compression
change places as compared with the case $\theta > 0$, and the direction
of motion of the points along the hyperbolas is reversed.
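The eigenvectors and eigenvalues of the hyperbolic rotation can be checked on a numerical example. The sketch below applies formulas (9) to vectors on the two asymptotes and to one generic point; the value of $\theta$ and the sample vectors are arbitrary choices made for illustration.

```python
import math

def hyp_rotate(x, theta):
    """Formulas (9): components of the image under the hyperbolic rotation."""
    c, s = math.cosh(theta), math.sinh(theta)
    return (x[0] * c - x[1] * s, -x[0] * s + x[1] * c)

theta = 0.9
lam1 = math.cosh(theta) - math.sinh(theta)   # equals exp(-theta)
lam2 = math.cosh(theta) + math.sinh(theta)   # equals exp(theta)

v1 = (1.0, 1.0)     # on the first asymptote,  x2 = x1
v2 = (1.0, -1.0)    # on the second asymptote, x2 = -x1
img1 = hyp_rotate(v1, theta)
img2 = hyp_rotate(v2, theta)

# A generic point keeps its value of (x1)^2 - (x2)^2.
p = (2.0, 0.5)
q = hyp_rotate(p, theta)
form_before = p[0] ** 2 - p[1] ** 2
form_after = q[0] ** 2 - q[1] ** 2
```

The images `img1` and `img2` are the multiples $\lambda_1 v_1$ and $\lambda_2 v_2$, the product $\lambda_1\lambda_2$ equals 1, and replacing `theta` by `-theta` interchanges the two factors, which is the exchange of stretching and compression described above.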

13. Note in conclusion that on one and the same Euclidean plane
it is possible to specify an infinite number of distinct Minkowski
metrics. To each there corresponds its own pair of conjugate
hyperbolas for the unit circle; conversely, any pair of conjugate
hyperbolas serves as a unit circle for some (quite definite) Minkowski
metric.

If the hyperbolas constituting the unit circle of a Minkowski
metric are not equilateral, the aforementioned Euclidean
characteristic of orthogonality of two vectors in the sense of Minkowski
(their symmetry relative to one of the asymptotes) is no longer
valid. More general (and true in all cases) is the following
contention: two vectors are orthogonal in a given Minkowski metric
if and only if they extend along the two conjugate diameters of
the hyperbolas making up the unit circle in that metric. We omit
the proof of this assertion.

By giving different Minkowski metrics to the plane, we obtain
distinct quadratic-metric spaces. They are of course all metrically
isomorphic to one another. From the algebraic standpoint, their
geometries are identical since their subject matter consists of the
invariants of one and the same $k$-orthogonal group.

§  9.  Tensor  algebra  in  quadratic-metric  spaces 

1. We again consider quadratic-metric spaces of arbitrary
dimension and give a description of tensor algebra in such spaces.
To be more exact, we will indicate special propositions of tensor
algebra connected with the presence of a metric. Note that the
tensorial apparatus proves to be useful in many problems of the theory
of quadratic-metric spaces, particularly in cases where the
circumstances compel the use of arbitrary (nonorthonormal) bases.

2. Let $R_n$ be an n-dimensional linear space with a specified metric form

$\|x\|^2 = g(x, x) = \sum g_{ik} x^i x^k$  (1)

where $x = \sum x^i e_i$ in an arbitrary basis $e_1, \dots, e_n$.

Accordingly, for the scalar product we have

$(x, y) = g(x, y) = \sum g_{ik} x^i y^k$  (2)

The tensor of the quadratic form (1) or, what is the same, of the bilinear form (2), is called the metric tensor of the space $R_n$. This is a second-order covariant symmetric tensor whose components relative to the basis $e_1, \dots, e_n$ are given by the multiplication table of the basis vectors:

$(e_i, e_k) = g_{ik}$

3. By $G = \|g_{ik}\|$ we denote the matrix of the metric tensor in the basis $e_1, \dots, e_n$, that is, the matrix of the form (2). Because of the nonsingularity of the form $g(x, y)$ we have $\det G \neq 0$. Thus, there exists an inverse matrix $G^{-1}$. The elements of $G^{-1}$ are indicated by the standard procedure, $g$ with superscripts:

$G^{-1} = \|g^{ik}\|$

By the definition of an inverse matrix we have

$\sum_{\alpha} g^{i\alpha} g_{\alpha k} = \delta^i_k$  (3)

4. Theorem 1. The quantities $g^{ik}$ are the components of a second-order contravariant tensor.

Remark. The assertion of the theorem means that when passing to a new basis we have the transformation law

$g^{i'k'} = \sum g^{ik} Q^{i'}_i Q^{k'}_k$  (4)

where $g^{i'k'}$ are the elements of the matrix inverse to the matrix $\|g_{i'k'}\|$ of the covariant metric tensor relative to the new basis. Equation (4) should be derived as a consequence of the transformation law

$g_{i'k'} = \sum g_{ik} P^i_{i'} P^k_{k'}$  (4a)

which we know together with the definition of the metric tensor. However, it is technically difficult to derive (4) from (4a), and so the following proof of the theorem is based on an earlier described characteristic of tensor quantities (see Chapter V, Section 4).

Proof. Consider the space $R_n^*$ which is conjugate to the space $R_n$, and in it a basis $e^1, \dots, e^n$, which is reciprocal to the basis $e_1, \dots, e_n \in R_n$.

Construct the transformation $u = G(x)$, which to each vector $x = \sum x^k e_k \in R_n$ associates a vector $u = \sum u_i e^i \in R_n^*$ via the formula

$u_i = \sum g_{ik} x^k$  (5)

Since the orders are in agreement here (in the right member we have a contraction of a second-order covariant tensor with a first-order vector), the transformation $u = G(x)$ is specified invariantly.

On the other hand, due to the fact that $\det G \neq 0$, the image of the space $R_n$ is the entire space $R_n^*$. This means that for every vector $u \in R_n^*$ there is a vector $x \in R_n$ such that $u = G(x)$. This vector $x$ is uniquely defined by

$x^i = \sum g^{ik} u_k$  (6)

To obtain (6) it suffices to solve the system of equations (5), regarding the $x^k$ as unknown numbers and the $u_i$ as known.

Formula (6) shows that when $g^{ik}$ is contracted with an arbitrary covariant vector $u_k$, the result is a first-order contravariant tensor. By a familiar characteristic of tensorial quantities we conclude that $g^{ik}$ is a tensor whose total order corresponds to the arrangement of the indices. The proof of the theorem is complete.
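
Although the derivation of (4) from (4a) is avoided in the proof, the law itself is easy to check numerically. The following Python sketch does so in two dimensions with exact rational arithmetic. The matrices `G` and `P` are hypothetical sample data, and the storage convention (`P[i][i2]` holding $P^i_{i'}$, `Q[i2][i]` holding $Q^{i'}_i$) is an assumption made only for this illustration.

```python
from fractions import Fraction as F

def matmul(a, b):
    # Ordinary matrix product of two lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(m):
    # Inverse of a nonsingular 2 x 2 matrix.
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

G = [[F(2), F(1)], [F(1), F(1)]]   # g_ik in the old basis (sample values)
P = [[F(1), F(2)], [F(0), F(1)]]   # P[i][i2] = P^i_{i'} (sample change of basis)
Q = inv2(P)                        # Q[i2][i] = Q^{i'}_i, the inverse of P

# (4a): covariant components in the new basis.
G_new = matmul(matmul(transpose(P), G), P)

# Left side of (4): the inverse of the new covariant matrix.
lhs = inv2(G_new)
# Right side of (4): the old contravariant components transformed with Q.
rhs = matmul(matmul(Q, inv2(G)), transpose(Q))

print(lhs == rhs)
```

The two sides agree exactly, which is what Theorem 1 asserts for this particular choice of `G` and `P`; the check is of course no substitute for the proof.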

5. The tensor $g^{ik}$ is termed the contravariant metric tensor.

6. It is convenient in quadratic-metric spaces to make use of what is called the reciprocal basis of the given basis.

Definition. The bases $e_1, \dots, e_n$ and $e^1, \dots, e^n$ in $R_n$ are called reciprocal bases if $(e^i, e_j) = \delta^i_j$.

Remark.  In  contrast  to  Chapter  V,  here  the  given  basis  and  the 
reciprocal  basis  are  taken  in  the  same  space.  Later  on  it  will  be 
shown  that  the  concept  of  reciprocal  bases  in  a  single  quadratic- 
metric  space  actually  reduces  to  that  of  reciprocal  bases  lying  in 
the  given  space  and  in  the  conjugate  space. 

Fig. 55 depicts reciprocal bases $e_1, e_2$ and $e^1, e^2$ in a plane with the ordinary Euclidean metric. By the definition of reciprocal bases, we have four conditions in this case:

$(e^1, e_1) = 1, \quad (e^1, e_2) = 0,$

$(e^2, e_1) = 0, \quad (e^2, e_2) = 1$

From them it follows that the vector $e^1$ is perpendicular to the vector $e_2$, and the vector $e^2$ is perpendicular to the vector $e_1$; besides, since $(e^1, e_1) > 0$ and $(e^2, e_2) > 0$, it follows that $e^1, e_1$ and also $e^2, e_2$ form acute angles. To take precise account of the conditions $(e^1, e_1) = 1$ and $(e^2, e_2) = 1$ in the figure requires specifying a scale unit (these conditions determine the lengths of the vectors $e^1, e^2$ from the given vectors $e_1, e_2$). In the two-dimensional Euclidean case it is clear geometrically that the given basis uniquely determines the reciprocal basis. At the same time the following general theorem holds true.

Theorem 2. For an arbitrary given basis $e_1, \dots, e_n$ in $R_n$ the reciprocal basis $e^1, \dots, e^n$ is always determined uniquely.

Proof. The vectors of the desired reciprocal basis can always be represented as

$e^k = \sum A^{k\alpha} e_\alpha$  (7)

where $A^{k\alpha}$ are unknown numerical coefficients.

Form the scalar product of $e_i$ into the right and left members of (7). Noting that

$(e^k, e_i) = \delta^k_i, \quad (e_\alpha, e_i) = g_{\alpha i}$  (8)

we get

$\delta^k_i = \sum A^{k\alpha} g_{\alpha i}$

The product of an unknown matrix $A = \|A^{k\alpha}\|$ by a known nonsingular matrix $G$ yields the unit matrix $E = \|\delta^k_i\|$, whence $A = G^{-1}$, that is, $A^{k\alpha} = g^{k\alpha}$. We thus get the only possible equations

$e^k = \sum g^{k\alpha} e_\alpha$  (9)

Since $\det(g^{k\alpha}) \neq 0$, the vectors $e^1, \dots, e^n$ defined by (9) are linearly independent and, hence, constitute a basis. It remains to see, via direct verification, that this basis is indeed reciprocal to the given basis. We have

$(e^k, e_i) = \sum g^{k\alpha} (e_\alpha, e_i) = \sum g^{k\alpha} g_{\alpha i} = \delta^k_i$

which is what is required, and this completes the proof.
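
Formula (9) is also a practical recipe for computing the reciprocal basis. The sketch below applies it in the ordinary Euclidean plane, where the scalar product of two vectors given by Cartesian coordinates is the usual dot product. The basis `e1`, `e2` and the helper name `reciprocal_basis_2d` are hypothetical choices made for this illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    # Ordinary Euclidean scalar product in Cartesian coordinates.
    return sum(a * b for a, b in zip(u, v))

def reciprocal_basis_2d(e1, e2):
    basis = (e1, e2)
    # Multiplication table of the basis vectors: g_ik = (e_i, e_k).
    g = [[dot(basis[i], basis[k]) for k in range(2)] for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    # Elements g^{ik} of the inverse matrix.
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # Formula (9): e^k = sum over alpha of g^{k alpha} e_alpha.
    return [tuple(sum(g_inv[k][a] * basis[a][c] for a in range(2))
                  for c in range(2)) for k in range(2)]

e1, e2 = (F(1), F(0)), (F(1), F(1))    # a skew basis (sample values)
r1, r2 = reciprocal_basis_2d(e1, e2)   # the vectors e^1, e^2

# Direct verification of the four defining conditions.
print(dot(r1, e1), dot(r1, e2), dot(r2, e1), dot(r2, e2))
```

For this sample basis the reciprocal vectors are $e^1 = (1, -1)$ and $e^2 = (0, 1)$: $e^1$ is perpendicular to $e_2$ and $e^2$ is perpendicular to $e_1$, as the definition requires.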

7. Inverting (9) we get

$e_k = \sum g_{k\alpha} e^\alpha$  (10)

8. Forming the scalar product of $e^i$ into (9), we get

$(e^i, e^k) = \sum g^{k\alpha} (e^i, e_\alpha) = \sum g^{k\alpha} \delta^i_\alpha$

whence we obtain a multiplication table for the vectors of the reciprocal basis:

$(e^i, e^k) = g^{ik}$  (11)

The right members of this table yield the components of the contravariant metric tensor.




9. Let $x, y$ be arbitrary vectors in $R_n$. Expand them in terms of the reciprocal basis; the components with respect to this basis will (here and henceforth) be indicated by lower indices. Forming the scalar product of $x$ by $y$ and using formulas (11), we get

$(x, y) = \sum g^{ik} x_i y_k$  (12)

At the same time we have

$\|x\|^2 = \sum g^{ik} x_i x_k$  (13)

Formulas (12) and (13) are reciprocals of (2) and (1).


10. Given two vectors $x, u \in R_n$. Expand $x$ in terms of the basis $e_1, \dots, e_n$:

$x = \sum x^i e_i$

Expand $u$ in terms of the reciprocal basis $e^1, \dots, e^n$:

$u = \sum u_k e^k$

Form the scalar product

$(u, x) = \sum u_k x^i (e^k, e_i) = \sum u_k x^i \delta^k_i = \sum u_i x^i = u_1 x^1 + \dots + u_n x^n$

We see that when the vector $x$ is expressed in terms of components relative to the given basis and the vector $u$ in terms of components relative to the reciprocal basis, then the scalar product $(u, x)$ is expressed as a contraction. The vectors $u$ and $x$ can naturally be interchanged, in which case we have

$(u, x) = u^1 x_1 + \dots + u^n x_n$

11. In the space $R_n$, to each linear form $u(x)$ there uniquely corresponds a vector $u \in R_n$ such that $u(x) = (u, x)$. In other words, every linear form in $R_n$ is uniquely representable as a scalar product. Indeed, relative to the basis $e_1, \dots, e_n$ we have

$u(x) = u_1 x^1 + \dots + u_n x^n$

where $u_1, \dots, u_n$ are quite definite coefficients of the form $u(x)$, whence $u(x) = (u, x)$, where $u = \sum u_k e^k$ ($e^1, \dots, e^n$ is the reciprocal basis).

12. The linear forms of the space $R_n$ are elements of the conjugate space $R_n^*$. By the preceding subsection, every form $u(x) \in R_n^*$ is associated with a vector $u \in R_n$ such that $u(x) = (u, x)$.

Clearly, this is a one-to-one correspondence between $R_n^*$ and $R_n$. It is also easy to see that it is a linear isomorphism as well.

Indeed, if $u(x) = (u, x)$ and $v(x) = (v, x)$, then $u(x) + v(x) = (u + v, x)$ and $\alpha u(x) = (\alpha u, x)$, where $\alpha$ is any scalar.

13. Now we need not distinguish between $R_n^*$ and $R_n$ if the elements of $R_n^*$ are replaced by their images in $R_n$ under the isomorphism just indicated. Then every vector of $R_n$ is also a vector of $R_n^*$. Accordingly, we say that a quadratic-metric space is a self-conjugate space ($R_n = R_n^*$).

If $x$ and $u$ are vectors of $R_n$ but one is regarded as a vector of $R_n$ and the other is viewed as a vector of $R_n^*$, then their contraction in the sense of Section 2, Chapter V, coincides with the scalar product $(u, x)$; in this case we must assume that $e_1, \dots, e_n \in R_n$ and $e^1, \dots, e^n \in R_n^*$.

14. Suppose a given basis $e_1, \dots, e_n$ transforms via matrix $P$ by the formulas

$e_{i'} = \sum P^i_{i'} e_i$  (14)

If we introduce the basis $e^{1'}, \dots, e^{n'}$, which is reciprocal to the new basis $e_{1'}, \dots, e_{n'}$, then

$e^{i'} = \sum Q^{i'}_i e^i$  (15)

where, as usual, $Q = \|Q^{i'}_i\| = (P^*)^{-1}$. There is no need to prove (15) because, by Subsection 13, the defining of reciprocal bases in $R_n$ reduces to defining the reciprocal bases in $R_n$ and $R_n^*$. Therefore, in order to establish formulas (15) it suffices to refer to the results of Section 1, Chapter V.

15. Let $x$ be an arbitrary vector in $R_n$. We can expand it either in terms of the basis $e_1, \dots, e_n$ or the reciprocal basis $e^1, \dots, e^n$:

$x = \sum x^i e_i = \sum x_k e^k$  (16)

From (14) and (15) it follows that in passing to a new basis the components $x^i$ transform by the contravariant law and the components $x_k$ transform by the covariant law:

$x^{i'} = \sum Q^{i'}_i x^i, \quad x_{k'} = \sum P^k_{k'} x_k$

For this reason, $x^i$ and $x_k$ are respectively termed the contravariant and covariant components of the vector $x$. From (9), (10) and (16) follow the formulas

$x_k = \sum g_{ki} x^i$,  (17)

$x^i = \sum g^{ik} x_k$  (18)

which express (relative to the given basis) the covariant components of a vector in terms of its contravariant components, and also the contravariant components in terms of the covariant components. As for the vector $x$ itself, we can regard it with equal justification as contravariant or covariant (since $R_n = R_n^*$).

From what has been said in this subsection it follows that every first-order tensor (either contravariant $x^i$ or covariant $x_k$) may be invariantly represented as a vector in $R_n$ ($\sum x^i e_i$ or $\sum x_k e^k$). The first-order tensors $x^i$ and $x_k$ represent one and the same vector in $R_n$ if and only if they are related by the condition (17) or (18) (which one is immaterial since (18) follows from (17) and conversely).
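
Formulas (17) and (18) amount to multiplying a column of components by $G$ or by $G^{-1}$. The following Python sketch lowers and raises the index of a vector and compares the scalar product computed from the metric with the contraction $\sum x_k y^k$. The metric and the vectors are sample values chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction as F

G = [[F(2), F(1)], [F(1), F(1)]]        # covariant components g_ik (sample metric)
G_INV = [[F(1), F(-1)], [F(-1), F(2)]]  # contravariant components g^{ik}

def lower_index(x_up):
    # Formula (17): x_k = sum over i of g_ki x^i.
    return [sum(G[k][i] * x_up[i] for i in range(2)) for k in range(2)]

def raise_index(x_dn):
    # Formula (18): x^i = sum over k of g^{ik} x_k.
    return [sum(G_INV[i][k] * x_dn[k] for k in range(2)) for i in range(2)]

x_up = [F(3), F(-1)]   # contravariant components of x
y_up = [F(1), F(4)]    # contravariant components of y
x_dn = lower_index(x_up)

# Scalar product from the metric form: sum of g_ik x^i y^k.
scalar_via_metric = sum(G[i][k] * x_up[i] * y_up[k]
                        for i in range(2) for k in range(2))
# The same scalar product as a contraction: sum of x_k y^k.
scalar_via_contraction = sum(x_dn[k] * y_up[k] for k in range(2))

print(x_dn, raise_index(x_dn), scalar_via_metric, scalar_via_contraction)
```

Raising the lowered index returns the original components, which illustrates that (18) follows from (17) and conversely.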

16. It is easy to see that in $R_n$ multiple-order tensors can also be specified at pleasure by covariant, contravariant or mixed components. For instance, let

$a = \sum a^{ik} e_i e_k$

be a contravariant tensor of order two, that is, an element of the tensor product $R_n \otimes R_n$. Replacing $e_k$ in accord with (10), we find

$a = \sum a^{ik} e_i e_k = \sum a^{ik} e_i g_{k\alpha} e^\alpha = \sum a^{ik} g_{k\alpha} e_i e^\alpha = \sum a^{i\alpha} g_{\alpha k} e_i e^k$

Set

$a^i_k = \sum a^{i\alpha} g_{\alpha k}$  (19)

Then the same tensor can be represented as

$a = \sum a^i_k e_i e^k$

which is an element of the tensor product $R_n \otimes R_n^*$. Formulas (19) are expressed in words as follows: the second upper index of tensor $a$ is lowered by means of the metric tensor. From (3) and (19) follows

$a^{ik} = \sum a^i_\alpha g^{\alpha k}$

Here the lower index has been raised.

Some tensor calculations call for a good deal of such "juggling" of indices (raising and lowering of indices). In such operations, it is common to fix the original positions of indices by dots. For example, in (19) $a^{i\,\cdot}_{\cdot\,k}$ is used instead of $a^i_k$ to emphasize that the second index is lowered and the first index remains a superscript.

17. We have already mentioned that $g_{ik}$ and $g^{ik}$ are covariant and contravariant metric tensors respectively. However, due to formulas (3) and (10) we have the tensor equation

$\sum g_{ik} e^i e^k = \sum g^{ik} e_i e_k$

And so it is best to say that there is one metric tensor and that $g_{ik}$ and $g^{ik}$ are its covariant and contravariant components.


18. Now let it be given that $R_n$ has a positive index $k$. Let $e_1, \dots, e_n$ be an orthonormal basis in $R_n$ provided that the first $k$ vectors are unit vectors (and the others are imaginary-unit vectors). Then

$G = \|g_{ik}\| = E_k = \operatorname{diag}(\underbrace{1, \dots, 1}_{k},\ \underbrace{-1, \dots, -1}_{n-k})$  (20)

(see Section 3, Subsection 1), whence

$G^{-1} = \|g^{ik}\| = E_k = G$  (21)

Due to (21) the basis $e^1, \dots, e^n$, which is the reciprocal of the basis $e_1, \dots, e_n$, is also orthonormal and the first $k$ vectors are unit vectors too. Besides, from (9) and (20), or from (10) and (21), follow the relations

$e^i = e_i, \quad i = 1, 2, \dots, k;$

$e^i = -e_i, \quad i = k + 1, \dots, n$

Thus, the unit vectors of reciprocal orthonormal bases correspondingly coincide and the imaginary-unit vectors differ in sign. Besides that, by formulas (17) and (18) we have

$x^i = x_i, \quad i = 1, 2, \dots, k;$

$x^i = -x_i, \quad i = k + 1, \dots, n$

Similar equations apply for tensors of any order. We will write them down in the particular case of a tensor of order three whose first index is lowered:

$a_{i\,\cdot\,\cdot}^{\cdot\,jl} = a^{ijl}, \quad i = 1, 2, \dots, k;$

$a_{i\,\cdot\,\cdot}^{\cdot\,jl} = -a^{ijl}, \quad i = k + 1, \dots, n$

19. In the next section we will give some examples of the application of tensor algebra in quadratic-metric spaces.



§  10.  The  equation  of  a  hyperplane  in  quadratic-metric  space 

1. Here we consider affine quadratic-metric space, that is, an affine space $\mathfrak{A}$ that corresponds to a linear space $L$ with quadratic metric (see Section 2, Subsection 5). In $\mathfrak{A}$ let us specify a system of affine coordinates with arbitrary origin and basis $e_1, \dots, e_n$. In this coordinate system we specify the equation of a hyperplane:

$A_1 x^1 + \dots + A_n x^n + C = 0$

or, briefly,

$\sum A_k x^k + C = 0$  (1)

In passing to a new basis (with origin preserved) we have

$x^k = \sum P^k_{k'} x^{k'}$

whence

$\sum A_k x^k + C = \sum A_k P^k_{k'} x^{k'} + C = \sum A_{k'} x^{k'} + C$  (2)

where

$A_{k'} = \sum A_k P^k_{k'}$  (3)

The last expression in the chain of equations (2) is the left-hand member of the equation of the hyperplane in the new coordinate system. From (2) and (3) it follows that there is invariantly associated with the left member of the equation of the hyperplane a vector

$n = \{A_1, \dots, A_n\}$

with covariant components $A_1, \dots, A_n$, or

$n = A_1 e^1 + \dots + A_n e^n$

where $e^1, \dots, e^n$ is the reciprocal basis of the given basis $e_1, \dots, e_n$.

As for the running coordinates $x^1, \dots, x^n$ of an arbitrary point in the hyperplane, they are, by definition, contravariant, since they are the coordinates of the radius vector of that point relative to the given basis:

$x = x^1 e_1 + \dots + x^n e_n$

From the foregoing, it is clear that the left-hand member of the equation of the hyperplane may be written in invariant form via the scalar product

$(n, x) + C = 0$  (4)

2. If $(x_0^1, \dots, x_0^n)$ is a fixed point of the hyperplane and $x_0$ is its radius vector, then $C = -(n, x_0)$ and (4) becomes

$(n, x - x_0) = 0$  (5)

or, expanded,

$A_1 (x^1 - x_0^1) + \dots + A_n (x^n - x_0^n) = 0$

From (5) it follows that the vector $n$ is orthogonal to any vector $x - x_0$ lying in the hyperplane. Thus, the vector $n$ is a normal to the hyperplane given by equation (1).

The contravariant components of the normal, that is, the components of $n$ relative to the given basis $e_1, \dots, e_n$, are given by

$A^i = \sum g^{ik} A_k$  (6)

where $g^{ik}$ is the metric tensor.

3. Problem. Given in two-dimensional space, in some coordinate system, the metric form

$\|x\|^2 = 2 (x^1)^2 + 2 x^1 x^2 + (x^2)^2$

the straight line (one-dimensional plane) $3x^1 + 4x^2 + 10 = 0$, and the point $A(1, 1)$.

Find the foot of the perpendicular dropped on the given straight line from point $A$ in the given metric.

Solution. We have the matrix of the metric form:

$G = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, whence $G^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$,

and so $g^{11} = 1$, $g^{12} = g^{21} = -1$, $g^{22} = 2$. From the equation of the straight line we find its normal $n = \{3, 4\}$ relative to the basis $e^1, e^2$. To obtain the components of the normal relative to the given basis $e_1, e_2$, use formulas (6):

$A^1 = g^{11} A_1 + g^{12} A_2 = -1, \quad A^2 = g^{21} A_1 + g^{22} A_2 = 5$

From this we obtain (in the given coordinate system) the equation of the perpendicular to the given straight line passing through $A(1, 1)$:

$\dfrac{x^1 - 1}{-1} = \dfrac{x^2 - 1}{5}$

Solving this equation simultaneously with the equation of the straight line $MN$, we find point $B(2, -4)$, which is the desired foot of the perpendicular.
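
The solution can be retraced mechanically. The sketch below raises the index of the normal by formulas (6), moves from the given point along the resulting direction, and finds the parameter value at which the equation of the hyperplane is satisfied. The function name `foot_of_perpendicular` is hypothetical, and the code assumes that $(n, n) \neq 0$, which always holds for a positive definite metric such as the one in this problem.

```python
from fractions import Fraction as F

def foot_of_perpendicular(g_inv, A, C, point):
    """Foot of the perpendicular from `point` to the hyperplane sum A_k x^k + C = 0.

    g_inv holds the contravariant metric components g^{ik};
    A holds the covariant components of the normal;
    point holds contravariant coordinates.
    """
    n = len(A)
    # Formula (6): contravariant components of the normal.
    direction = [sum(g_inv[i][k] * A[k] for k in range(n)) for i in range(n)]
    # Points of the perpendicular are point + t * direction.
    # Substituting into the hyperplane equation gives a linear equation in t.
    denom = sum(A[k] * direction[k] for k in range(n))  # equals (n, n), assumed nonzero
    t = -(sum(A[k] * point[k] for k in range(n)) + C) / denom
    return [point[i] + t * direction[i] for i in range(n)]

G_INV = [[F(1), F(-1)], [F(-1), F(2)]]   # inverse of the matrix of the metric form
foot = foot_of_perpendicular(G_INV, [3, 4], 10, [1, 1])
print(foot)
```

For the data of the problem the parameter comes out as $t = -1$ and the computed point is $B(2, -4)$, in agreement with the solution above.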

If in the plane $(x^1, x^2)$ we construct an ellipse $\|x\| = 1$ (the unit circle in the given metric), then the directions of the straight line $MN$ and the perpendicular $AB$ are conjugate relative to this ellipse because of Subsections 9, 10, Section 7 (Fig. 56).



Fig. 56

Problem. Given in five-dimensional quasi-Euclidean space with positive index $k = 3$ is a hyperplane $x^1 + x^2 + x^3 + x^4 + x^5 = 1$ and a point $A(1, 1, 1, 1, 1)$. Find the foot of a perpendicular dropped on this hyperplane from $A$. It is known that the coordinate system is defined by an orthonormal basis whose first three vectors are unit vectors.

Solution. From the equation of the hyperplane we find its normal $n = \{1, 1, 1, 1, 1\}$ relative to the basis $e^1, \dots, e^5$, whence $n = \{1, 1, 1, -1, -1\}$ relative to the basis $e_1, \dots, e_5$ (see Section 9, Subsection 18). We thus have the equations of the perpendicular:

$\dfrac{x^1 - 1}{1} = \dfrac{x^2 - 1}{1} = \dfrac{x^3 - 1}{1} = \dfrac{x^4 - 1}{-1} = \dfrac{x^5 - 1}{-1}$

Solving them simultaneously with the equation of the given hyperplane, we get the desired point: $x^1 = -3$, $x^2 = -3$, $x^3 = -3$, $x^4 = 5$, $x^5 = 5$.
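
In an orthonormal basis of a quasi-Euclidean space, raising an index only changes the sign of the components belonging to the imaginary-unit vectors, so the computation is short. The following Python sketch repeats it for the data of this problem; the variable names are chosen for the illustration only.

```python
from fractions import Fraction

k, dim = 3, 5
# Diagonal of E_k = G = G^{-1}: +1 for unit vectors, -1 for imaginary-unit vectors.
signs = [1] * k + [-1] * (dim - k)

A = [1, 1, 1, 1, 1]       # covariant components of the normal
C = -1                    # hyperplane written as sum A_k x^k + C = 0
point = [1, 1, 1, 1, 1]   # the point A

# Raising the index: contravariant components of the normal.
direction = [s * a for s, a in zip(signs, A)]

# Parameter of the intersection of the line point + t * direction with the hyperplane.
numerator = -(sum(a * p for a, p in zip(A, point)) + C)
denominator = sum(a * d for a, d in zip(A, direction))   # equals (n, n)
t = Fraction(numerator, denominator)

foot = [p + t * d for p, d in zip(point, direction)]
print(direction, t, foot)
```

Here $(n, n) = 3 - 2 = 1$ and $t = -4$, which gives the point $(-3, -3, -3, 5, 5)$ found in the solution.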

§  11.  Euclidean  space.  Orthogonal  matrices.  Orthogonal  group 

1. Definition. Euclidean linear space is an n-dimensional linear space with quadratic metric, provided that its metric quadratic form $g(x, x)$ is positive definite.

The  term  Euclidean  is  also  used  for  an  n-dimensional  affine 
space  if  the  corresponding  linear  space  is  Euclidean. 

From  now  on  we  will  be  dealing  precisely  with  such  a  space. 
That  will  enable  us  to  consider  both  vectors  and  points. 

We  will  use  the  notation  En  for  an  n-dimensional  Euclidean 
space.  For  the  norm  of  a  vector  we  will  use  the  absolute-value 
sign:  |x|. 




2. In Euclidean space the Cauchy-Bunyakovsky inequality holds:

$(x, y)^2 \le (x, x) \cdot (y, y)$

Therefore

$\dfrac{|(x, y)|}{\sqrt{(x, x)}\,\sqrt{(y, y)}} \le 1$

equality occurring if and only if the vectors $x, y$ are collinear (linearly dependent).

Using this circumstance, we can introduce an angle between vectors. Namely, if $x, y$ are nonzero vectors, then the angle between them is a number $\varphi$ defined by

$\cos \varphi = \dfrac{(x, y)}{|x| \cdot |y|}, \quad 0 \le \varphi \le \pi$

Note that $\varphi = 0$ if and only if the vectors are collinear and in the same direction, whereas $\varphi = \pi$ signifies the vectors have opposite directions.

Using the angle, we can write the scalar product as is done in elementary vector algebra:

$(x, y) = |x| \cdot |y| \cdot \cos \varphi$
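
The definition of the angle translates directly into a computation once the matrix of the metric form is known. The sketch below is a minimal illustration in Python: `g` stands for the matrix $\|g_{ik}\|$ of a positive definite form relative to whatever basis is used, and the function names are hypothetical. The clamping of the cosine is only a guard against floating-point rounding; by the Cauchy-Bunyakovsky inequality the exact value never leaves the interval from $-1$ to $1$.

```python
import math

def scalar_product(x, y, g):
    # (x, y) = sum of g_ik x^i y^k
    return sum(g[i][k] * x[i] * y[k]
               for i in range(len(x)) for k in range(len(y)))

def angle(x, y, g):
    # cos(phi) = (x, y) / (|x| |y|), with 0 <= phi <= pi; x and y must be nonzero.
    c = scalar_product(x, y, g) / math.sqrt(
        scalar_product(x, x, g) * scalar_product(y, y, g))
    return math.acos(max(-1.0, min(1.0, c)))

identity = [[1, 0], [0, 1]]   # orthonormal basis
skew = [[2, 1], [1, 1]]       # a sample positive definite metric in a skew basis

print(angle([1, 0], [0, 1], identity))    # right angle
print(angle([1, 0], [-2, 0], identity))   # opposite directions
print(angle([1, 0], [0, 1], skew))        # basis vectors of the skew basis
```

In the skew metric the two basis vectors have $(e_1, e_2) = 1$, $|e_1|^2 = 2$, $|e_2|^2 = 1$, so the angle between them is $\pi/4$ rather than a right angle.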


3. In the orthonormal basis $e_1, \dots, e_n$ we have

$|x|^2 = (x^1)^2 + (x^2)^2 + \dots + (x^n)^2,$

$(x, y) = x^1 y^1 + x^2 y^2 + \dots + x^n y^n$

Thus, the familiar formulas of analytic geometry for the length of a vector and for the scalar product carry over directly to the multidimensional case.


4. For an arbitrary unit vector $e$ we can introduce the angles $\alpha_1, \dots, \alpha_n$ which it forms with the vectors of the orthonormal basis $e_1, \dots, e_n$. The cosines of these angles, $\cos \alpha_1, \dots, \cos \alpha_n$, are called the direction cosines of the vector $e$ (relative to the given basis). It is easy to see that

$e = e_1 \cos \alpha_1 + e_2 \cos \alpha_2 + \dots + e_n \cos \alpha_n$

and that

$\cos^2 \alpha_1 + \cos^2 \alpha_2 + \dots + \cos^2 \alpha_n = 1$

in complete analogy with the familiar relations of elementary analytic geometry.

5. By definition, an n-dimensional Euclidean space is a quadratic-metric space with positive index $k = n$. In $E_n$ space, every orthonormal basis consists solely of unit vectors (there are no imaginary-unit vectors). If $e_1, \dots, e_n$ is an arbitrary orthonormal basis in $E_n$, then a new basis

$e_{k'} = \sum P^k_{k'} e_k$  (1)

will also be orthonormal if and only if the matrix $P = \|P^k_{k'}\|$ satisfies the condition of $k$-orthogonality for $k = n$ (see Section 6, equation (2)). But when $k = n$ the matrix denoted in Section 1 by the symbol $E_k$ becomes the unit matrix $E$. From this we conclude that in Euclidean space the transformation (1) of an orthonormal basis $e_1, \dots, e_n$ again yields an orthonormal basis $e_{1'}, \dots, e_{n'}$ if and only if

$P P^* = E$  (2)

6. Definition. Every $n \times n$ matrix that satisfies (2) is said to be orthogonal.

7. By Section 6, orthogonal $n \times n$ matrices constitute a subgroup of the group of all nonsingular $n \times n$ matrices.

It is called the orthogonal group of $n \times n$ matrices and will henceforth be denoted by $O$.

8. The set of all orthonormal bases in a given Euclidean space $E_n$ is nothing but a class of bases relative to the orthogonal group, which class is generated by some one orthonormal basis.

If a linear space $L_n$ is given without a metric, then all the bases in $L_n$ split into classes relative to the group $O$. Each of these classes may be regarded as consisting of orthonormal bases if we introduce in $L_n$ a certain definite Euclidean metric corresponding to precisely that class. The Euclidean spaces into which $L_n$ is converted by specification in it of such metrics are distinct but metrically isomorphic. Their geometries are algebraically identical in the sense that the invariants of one and the same group $O$ constitute the subject of investigation in each instance.

In Section 7 we discussed similar things in detail for the two-dimensional case.


9. Because of (2), $(\det P)^2 = 1$, whence for every orthogonal matrix

$\det P = \pm 1$

Thus, the orthogonal group may be regarded as a subgroup of the group of matrices with unit modulus of the determinant (like all $k$-orthogonal groups, as has already been pointed out).

The matrices $P \in O$ for which $\det P = +1$ constitute the subgroup $O^+$ of the group $O$.

To matrices in $O^+$ there corresponds a transformation of orthonormal bases with orientation preserved, which, to some extent, is analogous to a rotation of a (two-dimensional) basis in the Euclidean plane (see Section 7). Such transformations of bases in spaces of any dimension are usually termed rotations (about a fixed origin), and so the group $O^+$ is often called the rotation group (also see Sections 7, 8, Chapter IX).

10. To a transformation of an orthonormal basis in $E_n$ via formulas (1), Subsection 5, corresponds a transformation of coordinates

$x^{i'} = \sum Q^{i'}_i x^i$  (3)

with matrix $Q = \|Q^{i'}_i\| = (P^*)^{-1}$. Because of (2) we have

$Q = P$

Thus, in Euclidean space the transition from one orthonormal basis to another via (1) with orthogonal matrix $P$ is associated with a transformation of coordinates with the same orthogonal matrix: $Q = P$.

11. A transformation of type (3) of the variables $x^1, \dots, x^n$ into the variables $x^{1'}, \dots, x^{n'}$ is called orthogonal if the matrix $Q$ is orthogonal. Orthogonal transformations of variables may be described in purely algebraic terms without resorting to orthonormal bases in $E_n$. These are the transformations of type (3), and only such transformations, for which the identity

$(x^{1'})^2 + \dots + (x^{n'})^2 \equiv (x^1)^2 + \dots + (x^n)^2$

holds true.
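
The properties collected in Subsections 5 to 11 can be checked on a concrete matrix. The following Python sketch uses a hypothetical orthogonal $2 \times 2$ matrix with rational entries, so that all comparisons are exact: it tests condition (2), computes the determinant, and verifies that the sum of squares of the coordinates is preserved for one sample vector.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# A sample orthogonal matrix (entries 3/5 and 4/5 are chosen for illustration).
P = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]
E = [[1, 0], [0, 1]]

is_orthogonal = matmul(P, transpose(P)) == E      # condition (2)
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]       # must be +1 or -1

x = [F(2), F(7)]                                  # sample coordinates
x_new = [sum(P[i][j] * x[j] for j in range(2)) for i in range(2)]

old_norm_sq = sum(c * c for c in x)
new_norm_sq = sum(c * c for c in x_new)

print(is_orthogonal, det, old_norm_sq, new_norm_sq)
```

Since the determinant of this sample matrix equals $+1$, it belongs to the subgroup $O^+$ and corresponds to a rotation of the plane.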

12. The metric form in $E_n$ space, relative to the orthonormal basis $e_1, \dots, e_n$, is

$|x|^2 = (x^1)^2 + \dots + (x^n)^2$

whence, if $G$ is the matrix of the form and $E$ is the unit matrix, then

$G^{-1} = G = E$

Thus, $g_{ii} = 1$, $g_{ik} = 0$ ($i \neq k$) and in exactly the same way $g^{ii} = 1$, $g^{ik} = 0$ ($i \neq k$). For this reason and on the basis of formulas (9) of Section 9

$e^k = e_k$

which means that if the given basis is orthonormal in $E_n$, then the reciprocal basis coincides with it. At the same time, the covariant, contravariant and mixed components of any tensor having identical indices coincide. For instance,

$a^i_k = \sum a^{i\alpha} g_{\alpha k} = a^{ik}$

In particular, for any vector

$x^k = x_k$

(also see Subsection 18 of Section 9). For this reason, if the nature of a problem referring to Euclidean space is such that it is possible to confine oneself to orthonormal bases, then the orders of the tensors are not distinguished and all indices on vectors and tensors are written as subscripts (as, for example, in the theory of elasticity).

13. We now indicate a special notation for any orthogonal matrix. Imagine that we have a table of the angles that the vectors of a new orthonormal basis form with the vectors of the old orthonormal basis:

            e_1         e_2         ...    e_n
  e_{1'}    \alpha_1    \beta_1     ...    \gamma_1
  e_{2'}    \alpha_2    \beta_2     ...    \gamma_2
  ...       ...         ...         ...    ...
  e_{n'}    \alpha_n    \beta_n     ...    \gamma_n

Using the conventional alphabet, $\alpha, \beta, \dots, \gamma$, we have, by Subsection 4, the relations

$e_{1'} = e_1 \cos \alpha_1 + e_2 \cos \beta_1 + \dots + e_n \cos \gamma_1,$

$\dots$

$e_{n'} = e_1 \cos \alpha_n + e_2 \cos \beta_n + \dots + e_n \cos \gamma_n$

This is an expanded version of the formula (1) via the direction cosines of the new basis vectors. Matrix $P$ has also been written down. Thus, any orthogonal matrix can be written via direction cosines as

$P = \begin{pmatrix} \cos \alpha_1 & \cos \beta_1 & \dots & \cos \gamma_1 \\ \cos \alpha_2 & \cos \beta_2 & \dots & \cos \gamma_2 \\ \dots & \dots & \dots & \dots \\ \cos \alpha_n & \cos \beta_n & \dots & \cos \gamma_n \end{pmatrix}$

With respect to this notation, observe the following characteristic property peculiar solely to orthogonal matrices, namely, in the case of an orthogonal matrix, the sum of the squares of the elements of a column or a row is equal to unity (this is due to normalization of bases); the sum of the products of corresponding elements in two columns or two rows is equal to zero (this is because each of the bases is orthogonal).

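14. Since $P^{-1} = P^*$, it follows that

$P^{-1} = \begin{pmatrix} \cos \alpha_1 & \cos \alpha_2 & \dots & \cos \alpha_n \\ \cos \beta_1 & \cos \beta_2 & \dots & \cos \beta_n \\ \dots & \dots & \dots & \dots \\ \cos \gamma_1 & \cos \gamma_2 & \dots & \cos \gamma_n \end{pmatrix}$

15. From the preceding two subsections and equation $Q = P$ we have formulas for the transformation of coordinates:

$x_{1'} = x_1 \cos \alpha_1 + x_2 \cos \beta_1 + \dots + x_n \cos \gamma_1,$

$\dots$

$x_{n'} = x_1 \cos \alpha_n + x_2 \cos \beta_n + \dots + x_n \cos \gamma_n$

The inverse formulas

$x_1 = x_{1'} \cos \alpha_1 + x_{2'} \cos \alpha_2 + \dots + x_{n'} \cos \alpha_n,$

$\dots$

$x_n = x_{1'} \cos \gamma_1 + x_{2'} \cos \gamma_2 + \dots + x_{n'} \cos \gamma_n$

are obtained from the condition $Q^{-1} = Q^*$ (or $Q^{-1} = P^{-1}$).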

§  12.  The  normal  equation  of  a  hyperplane  in  Euclidean  space 

1.  Given  in  E„  a  coordinate  system  with  an  arbitrary  basis 
e\,  ....  en.  Specified  in  this  coordinate  system  is  the  equation  of 
the  hyperplane 

Atx'+  ...  +  Anxn  +  C  =  0  (1) 

where  n  =  A ie' -f  •  •  •  is  the  normal  to  the  hyperplane, 

and  4,  . . . ,  en  is  the  reciprocal  basis  (see  Section  10). 

If  for  n  we  take  the  unit  normal  n0  and  the  constant  term  is  ne¬ 
gative  (or  zero),  then  under  these  conditions  (1)  is  called  the 
normal  equation.  Putting  the  constant  term  C  —  — p  ( p  ^  0)  in 
this  case,  we  write  the  normal  equation  as 

(«j,  x)  —  p  =  0  (2) 

where  x  —  xlet  4-  •  •  •  +  xnen  is  the  radius  vector  of  a  running 
point  of  the  hyperplane. 

In  order  to  reduce  the  general  equation  (1)  to  normal  form, 
multiply  it  by  the  normalizing  factor 


I 


NORMAL  EQUATION  OF  A  IIYPERPLANE 


303 


§  12] 

choosing  the  sign  with  the  condition  that  mC  <  0;  then  p  = 
=  — mC  >  0  (if  C  —  0  we  can  agree  to  take  m  with  the  plus 
sign).  Obviously,  no  =  mn  is  a  unit  vector. 

Denote by $\varphi$ the angle between $\mathbf{n}_0$ and $x$; from (2) we have

$$p = (\mathbf{n}_0, x) = |x| \cos\varphi \tag{3}$$

As in elementary analytic geometry, in the theory of multidimensional Euclidean spaces this quantity is called the projection of the vector $x$ on the normal with positive direction along the vector $\mathbf{n}_0$. At the same time it is readily seen that $p$ is the distance from the origin to the hyperplane. Indeed, from (3)

$$p \le |x|$$

and $p = |x|$ if $\cos\varphi = 1$ ($\varphi = 0$). Thus, $p$ is the length of the shortest of the radius vectors having termini on the given hyperplane. The particular case of a two-dimensional plane in three-dimensional space is shown in Fig. 57.

If $x^*$ is the radius vector of some point $M^*$ not lying on the hyperplane, then the number

$$\delta = (\mathbf{n}_0, x^*) - p \tag{4}$$

is the distance from $M^*$ to the given hyperplane taken with a sign: minus if $M^*$ and the origin $O$ lie to one side of the hyperplane and plus if $M^*$ and $O$ are on different sides.

To see that this is so, consider a running point $M$ of the hyperplane; if $x$ is its radius vector, then from (3) and (4) we have

$$\delta = (\mathbf{n}_0, x^*) - (\mathbf{n}_0, x) = -(\mathbf{n}_0, x - x^*) = -(\mathbf{n}_0, \overrightarrow{M^*M})$$

and so $|\delta| \le |\overrightarrow{M^*M}|$; equality is attained when $\overrightarrow{M^*M}$ is collinear with $\mathbf{n}_0$, and therefore $|\delta|$ is the length of the shortest of the vectors $\overrightarrow{M^*M}$ (see Fig. 58 where the space is three-dimensional).




2. In the particular case where the basis $e_1, \dots, e_n$ is orthonormal, the reciprocal basis coincides with it, and all the relations we have considered are completely analogous to the familiar facts of elementary analytic geometry. In this case

$$m = \frac{\pm 1}{\sqrt{A_1^2 + \dots + A_n^2}}$$

and the normal equation may be written as

$$x_1 \cos\alpha + x_2 \cos\beta + \dots + x_n \cos\gamma - p = 0 \tag{5}$$

where $\cos\alpha, \cos\beta, \dots, \cos\gamma$ are the direction cosines of the vector $\mathbf{n}_0$.

In an arbitrary skew basis, the notation (5) of a normal equation is meaningless. But the fundamental aspect of the problem of reducing a general equation to normal form and of the problem of the distance of a point to the hyperplane does not become more complicated. The only thing to bear in mind is that the normalizing factor is to be computed from the general formula

$$m = \frac{\pm 1}{\sqrt{\sum_{i,k} g^{ik} A_i A_k}} \tag{6}$$

(see Section 9, Subsection 9).

3. Problem. Given on a Euclidean plane a straight line $6x^1 + 8x^2 - 5 = 0$ and a point $M^*(2, 1)$. Find the distance from the point to the line. The metric tensor is known: $g_{11} = 2$, $g_{12} = g_{21} = 1$, $g_{22} = 1$ (in the given coordinate system).

Solution. First of all let us verify that the indicated metric tensor defines the Euclidean metric. We have $2(x^1)^2 + 2x^1x^2 + (x^2)^2 = (x^1)^2 + (x^1 + x^2)^2 \ge 0$, equality being attained only for $x^1 = x^2 = 0$. This means the metric tensor is indeed Euclidean.

Inverting the matrix $G$, we get $g^{11} = 1$, $g^{12} = g^{21} = -1$, $g^{22} = 2$. Furthermore, $A_1 = 6$, $A_2 = 8$, whence, by (6),

$$\sum_{i,k} g^{ik} A_i A_k = 68, \qquad m = \frac{1}{2\sqrt{17}}$$

and so, by (4),

$$\delta = m\,(6 \cdot 2 + 8 \cdot 1 - 5) = \frac{15}{2\sqrt{17}}$$
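The computation in this problem can be retraced numerically. The following is only a sketch under stated assumptions: the helper names `inverse_2x2` and `signed_distance` are hypothetical, the plane is two-dimensional, and the sign of $m$ is chosen by the rule $mC \le 0$ described above.

```python
from fractions import Fraction
from math import sqrt


def inverse_2x2(g):
    """Invert a 2x2 matrix given as nested lists (hypothetical helper)."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def signed_distance(A, C, g, point):
    """Return (delta, |n|^2) for the line A_1 x^1 + A_2 x^2 + C = 0.

    delta = m * (A_i x^i + C), where m = +-1 / |n| and the sign of m
    is chosen so that m * C <= 0.
    """
    g_inv = inverse_2x2(g)
    # |n|^2 = sum over i, k of g^{ik} A_i A_k
    norm_sq = sum(g_inv[i][k] * A[i] * A[k] for i in range(2) for k in range(2))
    m = 1 / sqrt(norm_sq)
    if C > 0:
        m = -m
    value = sum(A[i] * point[i] for i in range(2)) + C
    return m * value, norm_sq


# Data of the problem: metric tensor, line 6x^1 + 8x^2 - 5 = 0, point (2, 1).
g = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
delta, norm_sq = signed_distance([6, 8], -5, g, [2, 1])
print(norm_sq)  # 68
print(delta)    # close to 15 / (2 * sqrt(17)), about 1.819
```

Exact fractions are used for the metric so that the value 68 is reproduced without rounding; only the final square root is taken in floating point.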

§ 13. The volume of a parallelepiped in Euclidean space. The discriminant tensor. Vector product

1. Let the metric of the space $E_n$ be defined by specification, relative to a basis $e_1, \dots, e_n$, of the metric form

$$|x|^2 = \sum_{i,k} g_{ik} x^i x^k$$

Set $g = \det \|g_{ik}\|$. Since the metric form is positive definite in $E_n$, it follows that $g > 0$.

Consider in $E_n$ an arbitrary oriented parallelepiped $P$ constructed on the vectors $x_1, \dots, x_n$. By Section 5, Chapter VI, we can determine the oriented volume $V$ of $P$ by setting

$$V = c\sqrt{g}\, D(x_1, \dots, x_n) \tag{1}$$

where $c$ is a constant common to all parallelepipeds. Choosing $c$ is tantamount to choosing a unit of volume. The fact that in (1) we have used the discriminant of the metric form will help us to connect the unit of volume with the unit of length. For the unit we take the volume of an $n$-dimensional cube with unit side, that is to say, the volume of a parallelepiped constructed on the vectors of an orthonormal basis. Let $e^0_1, \dots, e^0_n$ be an orthonormal basis, $g_0$ the determinant of the metric form relative to the basis $e^0_1, \dots, e^0_n$. It is clear that $g_0 = 1$.

On the other hand, the matrix made up of the components of the basis vectors relative to this basis is a unit matrix; with respect to the basis $e^0_1, \dots, e^0_n$ we have $D(e^0_1, \dots, e^0_n) = 1$. Finally, by assumption the volume of a parallelepiped constructed on the vectors $e^0_1, \dots, e^0_n$ is equal to unity. From (1), by virtue of the foregoing, we find $c = 1$. Thus, with our choice of the unit of volume,

$$V = \sqrt{g}\, D(x_1, \dots, x_n) \tag{2}$$

From this it follows that the volume of a parallelepiped constructed on the basis vectors $e_1, \dots, e_n$ is given by the formula

$$V = \sqrt{g} \tag{3}$$

2. Since $g = 1$ in an orthonormal basis, formula (2) for orthonormal bases takes on the simpler aspect

$$V = D(x_1, \dots, x_n)$$

3. By (2) and also by Section 5, Chapter VI, we have the discriminant tensor of a Euclidean space relative to any basis $e_1, \dots, e_n$:

$$\varepsilon_{i_1 \dots i_n} = \sqrt{g}\, \delta_{i_1 \dots i_n} \tag{4}$$

Since the discriminant tensor is skew-symmetric with respect to all indices, the system of equations (4) is equivalent to the single equation $\varepsilon_{12 \dots n} = \sqrt{g}$ (since $\delta_{12 \dots n} = 1$). For $n = 3$ see Fig. 59. In orthonormal bases $g = 1$, and so $\varepsilon_{12 \dots n} = 1$.

4. Every linear subspace $L_k$ of dimension $k$ lying in $E_n$ is itself a $k$-dimensional Euclidean space. Indeed, the scalar product of every pair of vectors is defined for $L_k$ since it is defined throughout the space $E_n$. The metric form $|x|^2$ is positive definite in $L_k$ since $|x|^2 > 0$ for every $x \in L_k$, $x \ne 0$.

5. By the foregoing, the volume of any parallelepiped ($k$-dimensional volume) is defined in $L_k$. If an orientation is given in $L_k$ (by specification of a basis $a_1, \dots, a_k$), then also defined in $L_k$ is the oriented volume of oriented parallelepipeds.

6. It is easy to obtain a formula expressing the $k$-dimensional volume of a parallelepiped constructed on an arbitrary set of independent vectors $a_1, \dots, a_k$ in $E_n$. Merely take $a_1, \dots, a_k$ for a basis in the linear hull $L_k$ of these vectors. The metric tensor of the subspace $L_k$ relative to the basis $a_1, \dots, a_k$ has the components $\gamma_{ij} = (a_i, a_j)$; from this and (3) we have

$$V^2 = \det \|\gamma_{ij}\| = \begin{vmatrix} (a_1, a_1) & (a_1, a_2) & \dots & (a_1, a_k) \\ (a_2, a_1) & (a_2, a_2) & \dots & (a_2, a_k) \\ \vdots & \vdots & & \vdots \\ (a_k, a_1) & (a_k, a_2) & \dots & (a_k, a_k) \end{vmatrix}$$

Thus the square of the desired $k$-dimensional volume is given by the Gram determinant of the vectors $a_1, \dots, a_k$.
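As an illustration of the Gram determinant formula, here is a minimal sketch. It assumes the vectors are given by their components in an orthonormal basis, so that the scalar product is the ordinary sum of products; the function names `det` and `gram_volume_squared` and the sample vectors are hypothetical.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def gram_volume_squared(vectors):
    """Square of the k-dimensional volume: determinant of the matrix (a_i, a_j)."""
    gram = [[sum(p * q for p, q in zip(a, b)) for b in vectors] for a in vectors]
    return det(gram)


# Two vectors in E_3 (components in an orthonormal basis, sample data).
a1 = [1, 2, 2]
a2 = [2, -2, 1]
print(gram_volume_squared([a1, a2]))  # 81
```

The two sample vectors are orthogonal and each has length 3, so the area of the parallelogram is 9 and its square is 81. For linearly dependent vectors the Gram determinant vanishes.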

7. We conclude this section with an application of the discriminant tensor. In three-dimensional Euclidean space consider the vector product $z = [x \times y]$ of the vector $x = x^1 e_1 + x^2 e_2 + x^3 e_3$ by the vector $y = y^1 e_1 + y^2 e_2 + y^3 e_3$. It turns out that the covariant components of the vector product $z = z_1 e^1 + z_2 e^2 + z_3 e^3$ are given by the following simple formula:

$$z_i = \sum_{\alpha, \beta} \varepsilon_{i\alpha\beta}\, x^\alpha y^\beta \tag{5}$$

From (5) we immediately get a formula that yields the contravariant components of the vector product, that is to say, its components relative to the given basis $e_1, e_2, e_3$:

$$z^i = \sum_{j} g^{ij} z_j = \sum_{j, \alpha, \beta} g^{ij}\, \varepsilon_{j\alpha\beta}\, x^\alpha y^\beta \tag{6}$$

It is to be stressed that (5) and (6) permit computing a vector product relative to any (generally, nonorthonormal) basis.

To prove (5), note first of all that both members are first-order covariant axial tensors (the left member is an axial tensor by the definition of a vector product since it changes sign when the orientation of the basis is changed; the right member is an axial tensor because of the participation of the discriminant tensor, since it is an axial tensor). Equalities of tensor quantities are invariant and therefore it suffices to verify (5) relative to some specially chosen basis. If $x$ and $y$ are dependent, formula (5) is clearly true since the left and right members are then zero. Let $x$ and $y$ be independent. We take a basis with the first two vectors $e_1 = x$, $e_2 = y$. For $e_3$ take the unit vector orthogonal to $e_1$, $e_2$ (Fig. 60). Then the vector product $z = S e_3$, where $S$ is the area of the parallelogram $(e_1, e_2)$ (Fig. 61). From the definition of reciprocal bases it follows that here $e^3 = e_3$. Therefore $z = S e^3$ and so on the left-hand side of (5) we have

$$z_1 = 0, \quad z_2 = 0, \quad z_3 = S$$

Since $x = e_1 = \{1, 0, 0\}$, $y = e_2 = \{0, 1, 0\}$ it follows that

$$\sum_{\alpha, \beta} \varepsilon_{i\alpha\beta}\, x^\alpha y^\beta = \varepsilon_{i12}$$

and so on the right of (5) we have, for $i = 1, 2, 3$, the numbers

$$\varepsilon_{112} = 0, \quad \varepsilon_{212} = 0, \quad \varepsilon_{312} = \varepsilon_{123} = \sqrt{g}$$

But $\sqrt{g}$ is the volume of the parallelepiped $(e_1, e_2, e_3)$. And since $e_3$ is a unit vector and is orthogonal to $e_1$ and $e_2$, this volume is equal to the area $S$. Thus $\sqrt{g} = S$ and the proof of formula (5) is complete.
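Formula (5) can be checked numerically in a skew basis. The sketch below rests on several assumptions that are not in the text: the basis vectors are specified by Cartesian components and form a right-handed triple, $\sqrt{g}$ is computed as their triple product, and the covariant components are compared with $z_i = (z, e_i)$, where $z$ is the ordinary Cartesian vector product. All function names and the sample data are hypothetical.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    """Ordinary vector product in orthonormal (Cartesian) components."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2); zero if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


def covariant_cross(basis, x, y):
    """z_i = sum eps_{i a b} x^a y^b, where eps_{123} = sqrt(g)."""
    # For a right-handed basis sqrt(g) equals the triple product of the basis vectors.
    sqrt_g = dot(basis[0], cross(basis[1], basis[2]))
    return [sqrt_g * sum(levi_civita(i, a, b) * x[a] * y[b]
                         for a in range(3) for b in range(3))
            for i in range(3)]


def to_cartesian(basis, comps):
    """Convert contravariant components relative to `basis` to Cartesian components."""
    return [sum(comps[a] * basis[a][c] for a in range(3)) for c in range(3)]


basis = [[1, 0, 0], [1, 1, 0], [0, 1, 2]]   # skew, right-handed, sqrt(g) = 2
x, y = [1, 2, 0], [0, 1, 3]                  # contravariant components

z_cov = covariant_cross(basis, x, y)
z_cart = cross(to_cartesian(basis, x), to_cartesian(basis, y))
print(z_cov)                               # [12, -6, 2]
print([dot(z_cart, e) for e in basis])     # [12, -6, 2]
```

Both lines print the same covariant components, which is what formula (5) asserts for this particular basis.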


Chapter  IX 


LINEAR  TRANSFORMATIONS  OF 
EUCLIDEAN  SPACE 


§  1.  Adjoint  of  a  transformation 

1. In $n$-dimensional (real) Euclidean space $E_n$ consider a linear transformation $y = Ax$.

Definition. A linear transformation $y = \overset{*}{A}x$ is said to be the adjoint of the given transformation $A$ if for any $x$ and $z$ in $E_n$ we have the following equation of scalar products:

$$(Ax, z) = (x, \overset{*}{A}z) \tag{1}$$

2. Theorem. The adjoint $\overset{*}{A}$ of a given transformation $A$ is always uniquely defined.

Proof. For a given vector $z$ we seek a vector $f$ such that the equation

$$(Ax, z) = (x, f) \tag{2}$$

holds true for any $x \in E_n$. Having found such a vector $f$, set $\overset{*}{A}z = f$. It is required to prove that $f$ exists, is uniquely defined, and depends linearly on $z$. To do this, introduce the basis $e_1, \dots, e_n$ and the reciprocal basis $e^1, \dots, e^n$. Put $x = e_k$; now if the desired vector $f$ exists, then

$$(Ae_k, z) = (e_k, f) \tag{3}$$

The scalar product $(e_k, f)$ is equal to the component $f_k$ of $f$ relative to the reciprocal basis $e^1, \dots, e^n$, and so from (3) we get $f_k = (Ae_k, z)$. Hence, only the vector

$$\overset{*}{A}z = \sum_k (Ae_k, z)\, e^k \tag{4}$$

can be the desired vector $f$.

We now show that if $f = \overset{*}{A}z$ is given by (4), then the condition (2) holds for any $x \in E_n$. Expand $x$ in terms of the given basis: $x = x^1 e_1 + \dots + x^n e_n$; putting this expansion into the left member of (2), we get

$$(Ax, z) = \sum_k x^k (Ae_k, z) = \sum_k x^k f_k = (x, f)$$

Here, use is made of the expression for the scalar product of two vectors, one of which is expanded in terms of the given basis, the other in terms of the reciprocal basis.

These calculations yield (4) for the vector $\overset{*}{A}z$ and show that no other value is possible for $\overset{*}{A}z$, which demonstrates the existence and uniqueness of $\overset{*}{A}z$. The linearity of the transformation $\overset{*}{A}z$ follows from (4) and the linearity of the scalar product in the argument $z$. The proof of the theorem is complete.

3. In Chapter VII proof was given that a linear transformation $A$ is associated with a mixed tensor $A^i_k$. In space without a metric, the upper and lower indices of the tensor are not related in any way. We are now considering quadratic-metric (inner-product) space and we can raise or lower the indices of any tensor according to Section 9 of Chapter VIII. The operations of raising and lowering indices will be used frequently in what follows. This makes it necessary to agree on which of the indices of the tensor $A^i_k$ is to be regarded as the first and which as the second.

We make the convention that the upper index of the tensor of a linear transformation is the first, and we will write $A^{i\,\cdot}_{\cdot\,k} = A^i_k$.

Lowering the upper index, we obtain the covariant components of the tensor of the transformation $A$:

$$A_{ik} = \sum_\alpha g_{i\alpha} A^{\alpha\,\cdot}_{\cdot\,k}$$

Raising the lower index, we get the contravariant components of the tensor of the transformation $A$:

$$A^{ik} = \sum_\alpha g^{\alpha k} A^{i\,\cdot}_{\cdot\,\alpha}$$

4. Suppose we have a matrix $A = \|A^i_k\| = \|A^{i\,\cdot}_{\cdot\,k}\|$ of the transformation $y = Ax$ relative to an arbitrary basis $e_1, \dots, e_n$. Relative to the same basis, let us find the matrix of the adjoint $\overset{*}{A}$, which we denote by $\|\overset{*}{A}{}^i_k\| = \|\overset{*}{A}{}^{i\,\cdot}_{\cdot\,k}\|$. Consider the scalar product

$$(Ax, z) = \sum_{\alpha, j, k} g_{\alpha k} A^\alpha_j x^j z^k$$

Also consider the scalar product

$$(x, \overset{*}{A}z) = \sum_{\alpha, j, k} g_{j\alpha}\, x^j\, \overset{*}{A}{}^\alpha_k z^k$$

We have obtained two bilinear forms that must be identically equal. This is only possible if all their coefficients coincide:

$$\sum_\alpha g_{j\alpha} \overset{*}{A}{}^\alpha_k = \sum_\alpha g_{\alpha k} A^\alpha_j \tag{5}$$

Contracting both members of (5) with the contravariant metric tensor $g^{ij}$ and using (3), Section 9, Chapter VIII, we get the desired expressions of the elements of the matrix of the transformation $\overset{*}{A}$:

$$\overset{*}{A}{}^{i\,\cdot}_{\cdot\,k} = \sum_{\alpha, \beta} g_{\alpha k}\, g^{i\beta} A^\alpha_\beta \tag{6}$$

At the same time this is an expression of the tensor of the adjoint transformation in terms of the tensor of the given transformation.

Formula (6) may be written more briefly as $\overset{*}{A}{}^{i\,\cdot}_{\cdot\,k} = A^{\cdot\,i}_{k\,\cdot}$.

5. The simplest aspect of the relation between a given transformation and the adjoint is obtained in covariant components. From (5) we have

$$\overset{*}{A}_{jk} = A_{kj}$$

Thus, the covariant components of the tensor of the adjoint are equal to the covariant components of the tensor of the given transformation with indices interchanged.
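Formula (6) lends itself to a direct numerical check. The following sketch assumes a two-dimensional space with a sample metric tensor and a sample transformation matrix (both chosen here for illustration, not taken from the text); the helper names are hypothetical. In matrix language, formula (6) amounts to computing $G^{-1} A^{T} G$.

```python
from fractions import Fraction


def inverse_2x2(g):
    """Invert a 2x2 matrix given as nested lists (hypothetical helper)."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def adjoint_matrix(A, g):
    """Formula (6): adj[i][k] = sum over alpha, beta of g_{alpha k} g^{i beta} A^alpha_beta."""
    n = len(A)
    g_inv = inverse_2x2(g)
    return [[sum(g[a][k] * g_inv[i][b] * A[a][b] for a in range(n) for b in range(n))
             for k in range(n)] for i in range(n)]


def apply(A, x):
    """Components of the vector Ax."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


def scalar(g, x, z):
    """Scalar product (x, z) = sum of g_{ik} x^i z^k."""
    return sum(g[i][k] * x[i] * z[k] for i in range(len(x)) for k in range(len(z)))


g = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(4)]]  # sample metric tensor
A = [[1, 2], [0, 3]]                                          # sample transformation
A_adj = adjoint_matrix(A, g)

x, z = [1, 2], [3, -1]
print(scalar(g, apply(A, x), z) == scalar(g, x, apply(A_adj, z)))  # True
```

Note that in a skew basis the matrix of the adjoint is in general not the transpose of the given matrix; the transpose appears only after the indices are lowered.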

The  adjoint  of  a  transformation  is  a  very  important  concept  and 
is  frequently  encountered  in  various  divisions  of  mathematics  and 
its  applications.  For  this  reason  the  notion  of  an  adjoint  transfor¬ 
mation  forms  the  basis  of  the  classification  given  below. 

6. Let us indicate the transformations that are most simply related to their adjoints:

(1) $\overset{*}{A} = A$, self-adjoint transformations;

(2) $\overset{*}{A} = -A$, skew-adjoint (or skew) transformations;

(3) $\overset{*}{A} = A^{-1}$; as will be shown later on, these transformations coincide with isometric transformations.

It  will  be  proved  later  on  that  any  linear  transformation  in 
Euclidean  space  reduces  to  a  product  of  self-adjoint  and  isometric 
transformations.  In  this  connection,  a  particularly  detailed  study 
will  be  made  of  self-adjoint  and  isometric  transformations. 

Skew  transformations  play  an  important  part  in  mechanics.  We 
will  look  into  the  geometric  meaning  of  a  skew  transformation  in 
the  three-dimensional  case  (Section  6). 

§  2.  Lemma  on  the  characteristic  roots  of  a  symmetric  matrix 

1. Let $A = \|A_{ik}\|$ be a real matrix and $p(\lambda) = \det(A - \lambda E)$ its characteristic polynomial. The following lemma is valid.

Lemma.  If  a  real  matrix  A  is  symmetric,  then  all  the  roots  of 
its  characteristic  polynomial  are  real. 


2. Proof. Let $\lambda$ be an arbitrary root of the polynomial $p(\lambda)$. Then, firstly, the system

$$\sum_k (A_{ik} - \lambda\delta_{ik})\, x_k = 0 \quad (i = 1, \dots, n)$$

has a nonzero solution $(x_1, \dots, x_n)$; secondly, the number $\bar\lambda$ is also a root of $p(\lambda)$ since the coefficients of the polynomial are real (here $\bar\lambda$ is the conjugate of the complex number $\lambda$; the bar above the letter will have the same meaning in the sequel). We now prove that the numbers $\bar x_1, \dots, \bar x_n$ form a solution (which is obviously nontrivial) of the system

$$\sum_k (A_{ik} - \bar\lambda\delta_{ik})\, \bar x_k = 0 \quad (i = 1, \dots, n)$$

By the rules for operating with complex numbers we have

$$\sum_k (A_{ik} - \bar\lambda\delta_{ik})\, \bar x_k = \sum_k \overline{(A_{ik} - \lambda\delta_{ik})}\, \bar x_k = \overline{\sum_k (A_{ik} - \lambda\delta_{ik})\, x_k} = 0$$

Thus

$$\sum_k A_{ik} x_k = \lambda x_i, \qquad \sum_k A_{ik} \bar x_k = \bar\lambda \bar x_i$$

Multiply the first equation by $\bar x_i$ and the second by $x_i$ and then sum over $i$:

$$\sum_{i,k} A_{ik}\, \bar x_i x_k = \lambda \sum_i x_i \bar x_i = \lambda \sum_i |x_i|^2;$$

$$\sum_{i,k} A_{ik}\, \bar x_k x_i = \bar\lambda \sum_i \bar x_i x_i = \bar\lambda \sum_i |x_i|^2$$

The matrix $A_{ik}$ is symmetric, and so

$$\sum_{i,k} A_{ik}\, \bar x_i x_k = \sum_{i,k} A_{ki}\, \bar x_i x_k = \sum_{i,k} A_{ik}\, \bar x_k x_i$$

hence

$$\lambda \sum_i |x_i|^2 = \bar\lambda \sum_i |x_i|^2$$

But the solution $(x_1, \dots, x_n)$ is not zero, i.e., $\sum_i |x_i|^2 \ne 0$. Hence $\lambda = \bar\lambda$. Thus $\lambda$ is a real number and the lemma is proved.
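For a second-order matrix the lemma can be seen directly from the discriminant of the characteristic polynomial. The sketch below is only an illustration of this special case, not a substitute for the proof; the function name `characteristic_discriminant` and the sample matrices are hypothetical.

```python
def characteristic_discriminant(a, b, d):
    """Discriminant of lambda^2 - (a + d) lambda + (a d - b^2).

    This is the characteristic polynomial of the symmetric matrix [[a, b], [b, d]].
    """
    trace = a + d
    determinant = a * d - b * b
    return trace * trace - 4 * determinant


# The discriminant equals (a - d)^2 + 4 b^2, which is never negative,
# so both characteristic roots of a real symmetric 2x2 matrix are real.
for a, b, d in [(1, 4, 1), (2, -3, 7), (0, 5, 0)]:
    disc = characteristic_discriminant(a, b, d)
    assert disc == (a - d) ** 2 + 4 * b ** 2
    assert disc >= 0
```

For a nonsymmetric real matrix such as $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ the corresponding discriminant is negative and the roots are not real, so the symmetry assumption cannot be dropped.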


§  3.  Self-adjoint  transformations 
1. A self-adjoint transformation $A$ is characterized by the condition $\overset{*}{A} = A$.

By (5), Section 1, the matrix of a self-adjoint transformation, which matrix is given relative to an arbitrary basis, is characterized by the relation

$$\sum_\alpha g_{j\alpha} A^\alpha_k = \sum_\alpha g_{k\alpha} A^\alpha_j \tag{1}$$

whence

$$A_{jk} = A_{kj} \tag{2}$$

Thus, a distinguishing feature of a self-adjoint transformation is the symmetry of the matrix of the covariant components of its tensor.

2. Due to (1) the matrix of a self-adjoint transformation, and only of such a transformation, is symmetric relative to an orthonormal basis: $A^i_k = A^k_i$. In other words, relative to an orthonormal basis, the condition of self-adjointness of a transformation is expressed by the matrix equation $A^* = A$, where $A$ is the transformation matrix and the asterisk stands for the transpose.

3.  Lemma  1.  All  roots  of  the  characteristic  polynomial  of  a  self- 
adjoint  transformation  are  real. 

Proof.  We  know  that  the  characteristic  roots  of  a  transformation 
are  invariant  under  a  change  of  basis.  Let  us  pass  to  an  ortho¬ 
normal  basis.  The  transformation  matrix  then  becomes  symmetric 
and  the  assertion  of  Lemma  1  will  follow  from  the  results  of  the 
preceding  section. 

Lemma 2. Let $e$ be an eigenvector of the self-adjoint transformation $A$, and let the subspace $L_e$ be the orthogonal complement of the linear hull of the vector $e$. Then $L_e$ is an invariant subspace for $A$.

Proof. Let $x \in L_e$. This means that $(x, e) = 0$. Because of self-adjointness, $(Ax, e) = (x, Ae)$. Taking advantage of the fact that $e$ is an eigenvector, we have

$$(Ax, e) = (x, Ae) = (x, \lambda e) = \lambda (x, e) = 0$$

In other words, $Ax \in L_e$ and the proof of Lemma 2 is complete.

Theorem.  For  every  self-adjoint  transformation  there  is  at  least 
one  orthonormal  basis  consisting  of  eigenvectors. 

Proof. We carry out the proof by induction. In the one-dimensional case every nonzero vector is an eigenvector and therefore when $n = 1$ the theorem holds. Let $n > 1$ be any natural number. Suppose the theorem is valid for any self-adjoint transformation in $E_{n-1}$.

Let $A$ be a self-adjoint transformation in $E_n$. Since all roots of $p(\lambda)$ are real, there will be at least one eigenvector $e$. We construct the orthogonal complement $L_e$ of the linear hull of $e$. The subspace $L_e$ is of dimension $n - 1$ and is an invariant subspace under the transformation $A$. It is thus possible to regard $A$ not on the entire space $E_n$ but only on $L_e$, where $A$ is clearly also self-adjoint. By the induction hypothesis, there is an orthonormal basis $e_2, \dots, e_n$ in $L_e$ consisting of eigenvectors. Adjoin the unit vector $e_1$, which is collinear with the eigenvector $e$. The vector $e_1$ is orthogonal to all the eigenvectors $e_2, \dots, e_n$ and so we obtain the desired basis $e_1, e_2, \dots, e_n$, which completes the proof of the theorem.

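Corollary. The matrix of every self-adjoint transformation can be reduced to diagonal form relative to an orthonormal basis.

As we know, the matrix $A$ relative to a basis of eigenvectors is written thus:

$$A = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \tag{3}$$

where $\lambda_1, \dots, \lambda_n$ are the collection of all roots of the characteristic polynomial

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \dots (\lambda - \lambda_n) \tag{4}$$

Here, the eigenvalue $\lambda_k$ corresponds to the eigenvector $e_k$ (to the basis vector with the same number label $k$).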

4. Lemma 3. The eigenvectors corresponding to numerically distinct characteristic roots are orthogonal to each other.

Proof. Let

$$Ax = \lambda_1 x, \qquad Ay = \lambda_2 y$$

where $\lambda_1 \ne \lambda_2$. Form the scalar product of the first equation by $y$, of the second by $x$, and subtract:

$$(Ax, y) - (Ay, x) = \lambda_1 (x, y) - \lambda_2 (y, x) = (\lambda_1 - \lambda_2)(x, y)$$

Because of self-adjointness,

$$(Ax, y) - (Ay, x) = (Ax, y) - (y, Ax) = 0$$

whence $(x, y) = 0$. Lemma 3 is proved.

Lemma 4. If $\lambda$ is a root of multiplicity $m$ of the characteristic polynomial of a self-adjoint transformation $A$, then $\operatorname{rank}(A - \lambda E) = n - m$, so that to the root $\lambda$ there correspond $m$ linearly independent eigenvectors.

Proof. According to Subsection 3, there is a basis in which the transformation matrix $A$ is of diagonal form (3). Relative to this basis, the characteristic matrix $A - \lambda E$ has the form

$$A - \lambda E = \begin{pmatrix} \lambda_1 - \lambda & & 0 \\ & \ddots & \\ 0 & & \lambda_n - \lambda \end{pmatrix} \tag{5}$$

For instance, let $\lambda_1$ be a root of multiplicity $m$ of the characteristic polynomial (4), that is, $\lambda_1 = \lambda_2 = \dots = \lambda_m$, $\lambda_{m+1} \ne \lambda_1, \dots, \lambda_n \ne \lambda_1$. Then for $\lambda = \lambda_1$ the first $m$ diagonal elements in matrix (5) vanish and the remaining diagonal elements are not equal to zero, so that

$$\operatorname{rank}(A - \lambda_1 E) = n - m \tag{6}$$

By Section 2, Chapter VII, equation (6) is valid in any basis. By Section 6, Chapter VII, the root $\lambda_1$ is associated with $m$ independent eigenvectors. The proof of Lemma 4 is complete.

5. To construct an orthonormal basis consisting of eigenvectors of a self-adjoint transformation, it is first of all necessary to find the roots of the characteristic polynomial: $\lambda_1, \lambda_2, \dots, \lambda_n$. They are all real but may be multiple. We will take note of which roots have the same values. Say, let $\lambda_1$ be of multiplicity $m$:

$$\lambda_1 = \lambda_2 = \dots = \lambda_m = \lambda'$$

$\lambda_{m+1}$ has multiplicity $k$:

$$\lambda_{m+1} = \dots = \lambda_{m+k} = \lambda''$$

and so on. To the root $\lambda'$ correspond $m$ linearly independent eigenvectors $e_1, \dots, e_m$ whose components are obtained from a homogeneous linear system of equations with matrix $A - \lambda' E$. Denote by $L'_m$ the linear hull of the vectors $e_1, \dots, e_m$. The subspace $L'_m$ is invariant and the transformation in it acts like a similarity transformation with coefficient $\lambda'$. Therefore, every nonzero vector in $L'_m$ is an eigenvector. Choose in $L'_m$ an arbitrary orthonormal basis $\tilde e_1, \dots, \tilde e_m$. It may be obtained by orthogonalizing the system of vectors $e_1, \dots, e_m$ or via other considerations.

We now find the eigenvectors $e_{m+1}, \dots, e_{m+k}$ that correspond to the root $\lambda''$; denote their linear hull by $L''_k$. Like $L'_m$, the subspace $L''_k$ is invariant. In $L''_k$ choose an arbitrary orthonormal basis $\tilde e_{m+1}, \dots, \tilde e_{m+k}$. The vectors $\tilde e_{m+1}, \dots, \tilde e_{m+k}$ are eigenvectors of the transformation that correspond to the root $\lambda''$. By Lemma 3, each of these eigenvectors is orthogonal to any vector in $L'_m$ (in other words, $L'_m \perp L''_k$).

Consequently, the vectors $\tilde e_1, \dots, \tilde e_m, \tilde e_{m+1}, \dots, \tilde e_{m+k}$ taken together form an orthonormal system.

We then pass to the next root and construct an invariant subspace of appropriate dimension and in it an orthonormal basis. Continuing this process, we obtain the desired basis after a finite number of operations.
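The orthogonalization step mentioned above can be sketched as follows. This is an illustration under stated assumptions: the components are taken relative to an orthonormal basis (so the scalar product is the ordinary sum of products), the process is the usual Gram-Schmidt procedure without the final normalization (to keep the arithmetic exact), and the function name `orthogonalize` and the sample vectors are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def orthogonalize(vectors):
    """Gram-Schmidt orthogonalization without normalization, in exact arithmetic."""
    result = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for u in result:
            # Remove from w its component along the already constructed vector u.
            coeff = dot(w, u) / dot(u, u)
            w = [wc - coeff * uc for wc, uc in zip(w, u)]
        result.append(w)
    return result


# Two independent solutions of 2*x1 + x2 - 2*x3 = 0 (sample data).
v1, v2 = orthogonalize([[1, 0, 1], [0, 2, 1]])
print(dot(v1, v2))                      # 0
print(2 * v2[0] + v2[1] - 2 * v2[2])    # 0: v2 still satisfies the equation
```

Since every vector of the eigenspace is an eigenvector, the orthogonalized vectors remain eigenvectors for the same root; dividing each by its length then gives an orthonormal basis of the eigenspace.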

Example 1. Given in three-dimensional space an orthonormal basis $e_1, e_2, e_3$ and the transformation

$$\begin{aligned} y_1 &= x_1 + 2x_2 - 4x_3, \\ y_2 &= 2x_1 - 2x_2 - 2x_3, \\ y_3 &= -4x_1 - 2x_2 + x_3 \end{aligned}$$

It is immaterial whether the indices on the components are subscripts or superscripts since the basis is orthonormal (see Chapter VIII, Section 11, Subsection 12).

The matrix of this transformation is symmetric:

$$A = \begin{pmatrix} 1 & 2 & -4 \\ 2 & -2 & -2 \\ -4 & -2 & 1 \end{pmatrix}$$

The symmetrical nature of a matrix in an orthonormal basis indicates that the transformation is self-adjoint.

Let us construct an orthonormal basis out of eigenvectors. We write the system

$$\sum_j (A_{kj} - \lambda\delta_{kj})\, x_j = 0$$

In the case at hand it is of the form

$$\left.\begin{aligned} (1 - \lambda)\, x_1 + 2x_2 - 4x_3 &= 0, \\ 2x_1 + (-2 - \lambda)\, x_2 - 2x_3 &= 0, \\ -4x_1 - 2x_2 + (1 - \lambda)\, x_3 &= 0 \end{aligned}\right\} \tag{7}$$

Form the characteristic polynomial:

$$p(\lambda) = \begin{vmatrix} 1 - \lambda & 2 & -4 \\ 2 & -2 - \lambda & -2 \\ -4 & -2 & 1 - \lambda \end{vmatrix} = -(\lambda^3 - 27\lambda - 54)$$

Its roots are $\lambda_1 = 6$, $\lambda_2 = \lambda_3 = -3$.

Putting $\lambda_1 = 6$ into (7) we get a solution (vector):

$$e_1 = \{2, 1, -2\}$$

Next, substitute $\lambda_2 = \lambda_3 = -3$ to obtain a system of rank one with two independent solutions. These solutions are readily chosen so that they yield orthogonal vectors:

$$e_2 = \{1, 2, 2\}, \qquad e_3 = \{2, -2, 1\}$$

Normalizing the eigenvectors thus found, we get the desired basis:

$$\begin{aligned} \tilde e_1 &= \{2/3,\ 1/3,\ -2/3\}, \\ \tilde e_2 &= \{1/3,\ 2/3,\ 2/3\}, \\ \tilde e_3 &= \{2/3,\ -2/3,\ 1/3\} \end{aligned}$$

The matrix $\tilde A$ of the transformation $A$ in the new basis can be written out at once without any calculations: it is of diagonal form. On the diagonal we have eigenvalues in the same order in which the corresponding eigenvectors lie in the basis:

$$\tilde A = \begin{pmatrix} 6 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & -3 \end{pmatrix}$$
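The eigenvectors of Example 1 can be verified by direct substitution. The following sketch uses only integer arithmetic; the helper names `apply` and `dot` are hypothetical, and the vectors are the unnormalized eigenvectors $\{2, 1, -2\}$, $\{1, 2, 2\}$, $\{2, -2, 1\}$.

```python
A = [[1, 2, -4],
     [2, -2, -2],
     [-4, -2, 1]]

# (eigenvalue, eigenvector) pairs of Example 1, before normalization.
eigenpairs = [(6, [2, 1, -2]), (-3, [1, 2, 2]), (-3, [2, -2, 1])]


def apply(matrix, x):
    """Components of the vector obtained by applying the matrix to x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in matrix]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


# Each vector satisfies A e = lambda e.
for lam, e in eigenpairs:
    assert apply(A, e) == [lam * c for c in e]

# The three vectors are pairwise orthogonal and each has squared length 9,
# so dividing by 3 gives the orthonormal basis of the example.
vectors = [e for _, e in eigenpairs]
print([[dot(u, v) for v in vectors] for u in vectors])
# [[9, 0, 0], [0, 9, 0], [0, 0, 9]]
```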

Example 2. The dimension is $n = 2$ and the basis $e_1, e_2$ is arbitrary. In this case, the metric characteristic, that is, the components of the metric tensor, must be given. Let

$$g_{11} = (e_1, e_1) = 1, \quad g_{12} = (e_1, e_2) = 1,$$

$$g_{21} = (e_2, e_1) = 1, \quad g_{22} = (e_2, e_2) = 4$$

The basis is skew and so the positions of the indices of a tensor (upper or lower) are essential and it is necessary that the indices be properly set. Suppose we have a self-adjoint linear transformation $y = Ax$:

$$\left.\begin{aligned} y^1 &= x^1 + 4x^2, \\ y^2 &= x^1 + x^2 \end{aligned}\right\}$$

It is required to reduce it to an orthonormal basis composed of eigenvectors.

First of all we have to verify that the condition of self-adjointness is indeed observed, which means we have to be convinced that the matrix

$$\begin{pmatrix} A^1_1 & A^1_2 \\ A^2_1 & A^2_2 \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ 1 & 1 \end{pmatrix}$$

will become symmetric (see (2)) after the indices are lowered by means of the metric tensor. In other words, the tensor

$$A_{ik} = \sum_\alpha g_{i\alpha} A^\alpha_k$$

must be symmetric.

Actually, it is enough to compare two components of this tensor: $A_{12}$ and $A_{21}$. Their calculation yields:

$$A_{12} = \sum_{\alpha=1}^{2} g_{1\alpha} A^\alpha_2 = g_{11} A^1_2 + g_{12} A^2_2 = 5$$

$$A_{21} = \sum_{\alpha=1}^{2} g_{2\alpha} A^\alpha_1 = g_{21} A^1_1 + g_{22} A^2_1 = 5$$

Thus, $A_{12} = A_{21}$. The transformation is self-adjoint and we can apply the general theory.

The system of equations $\sum_j (A^i_j - \lambda\delta^i_j)\, x^j = 0$ is written as follows:

$$\left.\begin{aligned} (1 - \lambda)\, x^1 + 4x^2 &= 0, \\ x^1 + (1 - \lambda)\, x^2 &= 0 \end{aligned}\right\}$$

The characteristic polynomial is

$$p(\lambda) = \begin{vmatrix} 1 - \lambda & 4 \\ 1 & 1 - \lambda \end{vmatrix} = \lambda^2 - 2\lambda - 3$$

with roots $\lambda_1 = -1$, $\lambda_2 = 3$.

When $\lambda = \lambda_1$ we get $x^1 = 2$, $x^2 = -1$ and thus the eigenvector $l_1 = \{2, -1\}$. For $\lambda = \lambda_2$ we have the eigenvector $l_2 = \{2, 1\}$. By Lemma 3, these eigenvectors are orthogonal in the given metric. Let us calculate their norms:

$$\|l_1\|^2 = \sum_{i,j} g_{ij}\, l_1^i l_1^j = 4, \quad \|l_1\| = 2;$$

$$\|l_2\|^2 = \sum_{i,j} g_{ij}\, l_2^i l_2^j = 12, \quad \|l_2\| = 2\sqrt{3}$$

Dividing $l_1$ and $l_2$ by their norms, we get the desired basis

$$\tilde e_1 = \left\{1,\ -\tfrac{1}{2}\right\}, \qquad \tilde e_2 = \left\{\tfrac{1}{\sqrt{3}},\ \tfrac{1}{2\sqrt{3}}\right\}$$

Relative to this basis, the transformation at hand has the matrix

$$\tilde A = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}$$

An analogous problem in the multidimensional case requires considerably more involved computations.
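The steps of Example 2 can be retraced in a few lines. The sketch below is an illustration in integer arithmetic, working with the unnormalized eigenvectors $\{2, -1\}$ and $\{2, 1\}$; the helper names `lower_index`, `apply` and `scalar` are hypothetical.

```python
g = [[1, 1], [1, 4]]   # metric tensor g_{ik}
A = [[1, 4], [1, 1]]   # mixed components A^i_k of the transformation


def lower_index(g, A):
    """Covariant components A_{ik} = sum over alpha of g_{i alpha} A^alpha_k."""
    return [[sum(g[i][a] * A[a][k] for a in range(2)) for k in range(2)]
            for i in range(2)]


def apply(A, x):
    return [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]


def scalar(g, x, z):
    """Scalar product in the skew basis: sum of g_{ik} x^i z^k."""
    return sum(g[i][k] * x[i] * z[k] for i in range(2) for k in range(2))


print(lower_index(g, A))          # [[2, 5], [5, 8]], symmetric

l1, l2 = [2, -1], [2, 1]
print(apply(A, l1), apply(A, l2)) # [-2, 1] [6, 3], i.e. -1 * l1 and 3 * l2
print(scalar(g, l1, l2))          # 0
print(scalar(g, l1, l1), scalar(g, l2, l2))  # 4 12
```

Note that $l_1$ and $l_2$ are orthogonal only with respect to the given metric; the ordinary sum of products of their components is $2 \cdot 2 + (-1) \cdot 1 = 3$, which is not zero.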

§ 4. Reducing a quadratic form to canonical form in an orthonormal basis

1. We know that every quadratic form may be reduced to canonical form in some basis.

The problem now involves a supplementary restriction: to carry through the solution within the class of orthonormal bases.

Let a quadratic form

$$f(x, x) = \sum_{i,j} a_{ij}\, x_i x_j$$

be given in the orthonormal basis $e_1, \dots, e_n$.

We introduce an auxiliary linear transformation $y = Ax$, which has the same matrix as the quadratic form relative to the original basis $e_1, \dots, e_n$:

$$A^i_j = a_{ij}$$

Using the fact that the basis is orthonormal, we have $A^i_j = A_{ij}$ (see Section 9, Chapter VIII). Besides, $a_{ij} = a_{ji}$. Hence the transformation $y = Ax$ is self-adjoint and so we have to find the basis $\tilde e_1, \dots, \tilde e_n$ consisting of the eigenvectors of this transformation.

When passing to the basis $\tilde e_1, \dots, \tilde e_n$, the matrix of the transformation $y = Ax$ takes the form

$$\tilde A = Q A Q^{-1}$$

and the matrix of the quadratic form is

$$A' = P A P^*$$

But the old basis and new basis are orthonormal and so matrix $P$ is orthogonal; hence,

$$P = Q, \qquad P^* = P^{-1} = Q^{-1}$$

From this it follows that in the new basis the matrices of the quadratic form and the auxiliary linear transformation coincide: $A' = \tilde A$.

Relative to the basis $\tilde e_1, \dots, \tilde e_n$ the transformation matrix $\tilde A$ has the diagonal form

$$\tilde A = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$

and so the quadratic form can be written in canonical form as

$$f(x, x) = \lambda_1 \tilde x_1^2 + \dots + \lambda_n \tilde x_n^2$$

Here, $\lambda_1, \dots, \lambda_n$ are the characteristic roots of the transformation $A$ which correspond to its eigenvectors $\tilde e_1, \dots, \tilde e_n$.

Conclusion. Naturally associated with every quadratic form specified relative to an orthonormal basis is a self-adjoint transformation. Reducing this transformation to diagonal form in an orthonormal basis of its eigenvectors automatically brings the quadratic form to canonical form.




2. Example. Let the following quadratic form be given in an orthonormal basis of three-dimensional Euclidean space:

$$f(x, x) = x_1^2 + 4x_1x_2 - 8x_1x_3 - 2x_2^2 - 4x_2x_3 + x_3^2$$

It is required to reduce it to canonical form also in an orthonormal basis.

Solution. The matrix of the quadratic form is

$$A = \begin{pmatrix} 1 & 2 & -4 \\ 2 & -2 & -2 \\ -4 & -2 & 1 \end{pmatrix}$$

In the preceding section we considered a self-adjoint transformation with precisely that matrix, and so we take advantage of the result and write down the answer:

$$f(x, x) = 6\tilde x_1^2 - 3\tilde x_2^2 - 3\tilde x_3^2$$

in the orthonormal basis

$$\begin{aligned} \tilde e_1 &= \{2/3,\ 1/3,\ -2/3\}, \\ \tilde e_2 &= \{1/3,\ 2/3,\ 2/3\}, \\ \tilde e_3 &= \{2/3,\ -2/3,\ 1/3\} \end{aligned}$$
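The answer of this example can be checked by substitution. The sketch below assumes the new basis vectors are $\{2/3, 1/3, -2/3\}$, $\{1/3, 2/3, 2/3\}$, $\{2/3, -2/3, 1/3\}$ (the normalized eigenvectors of the matrix of the form); it expresses a point given by new coordinates in the old coordinates and compares the two expressions for the form. The function names and the sample point are hypothetical.

```python
from fractions import Fraction


def f(x):
    """The quadratic form in the original orthonormal basis."""
    x1, x2, x3 = x
    return x1**2 + 4*x1*x2 - 8*x1*x3 - 2*x2**2 - 4*x2*x3 + x3**2


def canonical(t):
    """The same form in the new coordinates t1, t2, t3."""
    t1, t2, t3 = t
    return 6*t1**2 - 3*t2**2 - 3*t3**2


third = Fraction(1, 3)
new_basis = [[2*third, third, -2*third],
             [third, 2*third, 2*third],
             [2*third, -2*third, third]]


def to_old_coordinates(t):
    """Old coordinates of the vector t1*e1~ + t2*e2~ + t3*e3~."""
    return [sum(t[k] * new_basis[k][c] for k in range(3)) for c in range(3)]


t = [1, 2, 3]  # sample new coordinates
print(f(to_old_coordinates(t)), canonical(t))  # -33 -33
```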

§  5.  The  joint  reduction  to  canonical  form  of  two  quadratic  forms 

1. Given in a linear space (without a metric) a fixed arbitrary basis $e_1, \dots, e_n$ and two quadratic forms:

$$f(x, x) = \sum a_{ik} x^i x^k, \quad g(x, x) = \sum g_{ik} x^i x^k$$

Now, can a basis be found in which both quadratic forms will take on canonical form?

Theorem. If at least one of two quadratic forms is positive definite, there will be a basis in which both forms assume canonical form.

Proof.  Let  the  form  g(x,  x)  be  positive  definite.  Introduce  a 
Euclidean  metric  in  the  linear  space,  taking  g(x,  x)  for  the  metric 
form.  If  in  the  resulting  Euclidean  space  we  take  an  arbitrary 
orthonormal  basis,  the  metric  form  g(x,  x)  assumes  normal  form. 
Then  we  pass  to  another  orthonormal  basis  so  that  f(x,  x)  reduces 
to  canonical  form.  In  the  process,  the  normal  form  of  the  metric 
form  will  clearly  remain  intact.  The  theorem  is  proved. 

2.  In  the  practical  case  of  a  joint  reduction  of  two  quadratic 
forms  to  canonical  form  it  is  not  necessary  to  break  the  search  for 
the  desired  basis  into  two  stages,  as  is  done  in  the  proof. 




By Section 4, a quadratic form may be reduced to canonical form without going outside the class of orthonormal bases. Such a reduction is done via a self-adjoint transformation. The fact that the transformation is self-adjoint does not depend on the choice of basis. We therefore do as follows.

Take $g(x, x)$ for the metric form and, using the coefficients of $f(x, x)$, seek in the given basis an auxiliary self-adjoint transformation $y = Ax$. Its covariant tensor $A_{ij}$ must be symmetric and must be determined by the tensor equation $A_{ij} = a_{ij}$, which holds true in any basis. Now, to find the transformation matrix $A$, all we need to do is raise an index of the tensor $A_{ij}$ via the metric tensor:

$$A^k_j = \sum_\alpha A_{\alpha j} g^{k\alpha} = \sum_\alpha g^{k\alpha} a_{\alpha j} \qquad (1)$$

All the quantities in the right-hand member of this equation are given, and so the matter reduces to finding the eigenvalues $\lambda_1, \dots, \lambda_n$ and the orthonormal basis of the corresponding eigenvectors $e_1', \dots, e_n'$ of the self-adjoint transformation with the known matrix $\|A^k_j\|$. See Subsection 5 of Section 3.

3. Further simplifications are possible. It turns out that there is no need to compute $A^k_j$ in order to find the eigenvectors $e_1', \dots, e_n'$ and the eigenvalues $\lambda_1, \dots, \lambda_n$.

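Indeed, taking into account (1), we have

$$a_{kj} - \lambda g_{kj} = \sum_\alpha g_{k\alpha} (A^\alpha_j - \lambda \delta^\alpha_j) \qquad (2)$$

and so the system of equations

$$\sum_j (A^k_j - \lambda \delta^k_j) x^j = 0 \qquad (3)$$

is equivalent, for any $\lambda$, to the system

$$\sum_j (a_{kj} - \lambda g_{kj}) x^j = 0 \qquad (4)$$

Namely, due to (1) and (2) we have the equations

$$\sum_j (a_{kj} - \lambda g_{kj}) x^j = \sum_\alpha g_{k\alpha} \sum_j (A^\alpha_j - \lambda \delta^\alpha_j) x^j,$$

$$\sum_j (A^k_j - \lambda \delta^k_j) x^j = \sum_\alpha g^{k\alpha} \sum_j (a_{\alpha j} - \lambda g_{\alpha j}) x^j$$

from which clearly follows the equivalence of the systems (3) and (4).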

Thus, everything reduces to solving system (4), and in place of the characteristic polynomial $p(\lambda)$ we have to consider the polynomial

$$q(\lambda) = \det(a_{kj} - \lambda g_{kj})$$



But it has exactly the same roots (counted according to multiplicity) as the characteristic polynomial

$$p(\lambda) = \det(A^k_j - \lambda \delta^k_j)$$

For, due to (2),

$$q(\lambda) = g \cdot p(\lambda)$$

for any $\lambda$, where $g = \det G \ne 0$ and $G = \|g_{kj}\|$. Thus $q(\lambda)$ differs from $p(\lambda)$ only in a factor that does not include $\lambda$. Observe that $q(\lambda)$ and system (4) can be written down immediately from the given quadratic forms.
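The factor $g$ can be made explicit in matrix notation. The lines below are a sketch, assuming the shorthand $F = \|a_{kj}\|$, $G = \|g_{kj}\|$ and $A = \|A^k_j\|$, so that relation (1) reads $A = G^{-1}F$ and $E$ is the identity matrix.

```latex
\begin{aligned}
q(\lambda) &= \det(F - \lambda G)
            = \det\bigl(G\,(G^{-1}F - \lambda E)\bigr) \\
           &= \det G \cdot \det(A - \lambda E)
            = g \cdot p(\lambda).
\end{aligned}
```

Since $g > 0$ for a positive definite form, the two polynomials vanish at the same values of $\lambda$ with the same multiplicities.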

4. Summary. Given in an arbitrary basis $e_1, \dots, e_n$ two quadratic forms:

$$f(x, x) = \sum a_{ij} x^i x^j, \quad g(x, x) = \sum g_{ij} x^i x^j$$

where $g(x, x)$ is positive definite. To reduce them jointly to canonical form we solve the characteristic equation

$$q(\lambda) = \det(a_{kj} - \lambda g_{kj}) = 0$$

Let $\lambda_1, \dots, \lambda_n$ be the roots (all real).

Substitute these roots in succession in place of $\lambda$ into the system (4) and for each root find solutions $\{x^j\}$ of this system; if the multiplicity of root $\lambda$ is equal to $m$, then there are $m$ linearly independent solutions (see Lemma 4, Section 3). These solutions must be chosen so that they form an orthonormal system in the metric $g(x, x)$.

Then all the solutions thus found will yield the matrix of the components of the vectors of the desired basis $e_1', \dots, e_n'$. If the numbering is correct, that is, vector $e_k'$ corresponds to root $\lambda_k$, then relative to the basis $e_1', \dots, e_n'$ the given quadratic forms will assume the form

$$f = \sum_i \lambda_i (x'^i)^2, \quad g = \sum_i (x'^i)^2$$
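The procedure of this summary can be sketched for $n = 2$. The code below is an illustration, not the book's algorithm verbatim: it assumes symmetric $2 \times 2$ matrices, a positive definite `G`, and two distinct roots of $q(\lambda)$ (so that each root has a one-dimensional solution space). The function names `form` and `joint_reduction_2x2` are hypothetical.

```python
import math


def form(M, u, v):
    """Value of the bilinear form with matrix M on the vectors u, v."""
    return sum(M[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


def joint_reduction_2x2(F, G):
    """Return [(lam, e)] where e solves (F - lam*G) e = 0 and g(e, e) = 1.

    Assumes F, G symmetric, G positive definite, distinct roots.
    """
    a, b, c = F[0][0], F[0][1], F[1][1]
    p, q, r = G[0][0], G[0][1], G[1][1]
    # q(lam) = det(F - lam*G) = k2*lam^2 + k1*lam + k0
    k2 = p * r - q * q              # equals det G, positive
    k1 = -(a * r + c * p - 2 * b * q)
    k0 = a * c - b * b
    disc = k1 * k1 - 4 * k2 * k0    # nonnegative: the roots are real
    s = math.sqrt(disc)
    roots = [(-k1 - s) / (2 * k2), (-k1 + s) / (2 * k2)]
    result = []
    for lam in roots:
        rows = [(a - lam * p, b - lam * q), (b - lam * q, c - lam * r)]
        # Pick the larger row (m1, m2); then x = (m2, -m1) satisfies it.
        m1, m2 = max(rows, key=lambda row: abs(row[0]) + abs(row[1]))
        x = (m2, -m1)
        norm = math.sqrt(form(G, x, x))   # norm in the metric g
        result.append((lam, (x[0] / norm, x[1] / norm)))
    return result
```

Applied to the matrices of the example in Subsection 5, the sketch returns the roots $-1$ and $3$ together with vectors that are orthonormal in the metric $g$; each vector is determined only up to sign.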


5. Example. $n = 2$. Given the forms

$$f(x, x) = 2(x^1)^2 + 10x^1x^2 + 8(x^2)^2,$$

$$g(x, x) = (x^1)^2 + 2x^1x^2 + 4(x^2)^2$$

Reduce them jointly to canonical form.

It is easy to verify that $g(x, x)$ is positive definite. This is evident, for instance, from the equation $g(x, x) = (x^1 + x^2)^2 + 3(x^2)^2$. A joint reduction is possible and it can be carried out by the foregoing procedure.




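Write out the matrices of the quadratic forms:

$$F = \begin{Vmatrix} 2 & 5 \\ 5 & 8 \end{Vmatrix}, \quad G = \begin{Vmatrix} 1 & 1 \\ 1 & 4 \end{Vmatrix}$$

and form the characteristic polynomial:

$$q(\lambda) = \det(F - \lambda G) = \begin{vmatrix} 2 - \lambda & 5 - \lambda \\ 5 - \lambda & 8 - 4\lambda \end{vmatrix}$$

The roots are $\lambda_1 = -1$, $\lambda_2 = 3$. System (4) becomes

$$\left. \begin{aligned} (2 - \lambda) x^1 + (5 - \lambda) x^2 &= 0, \\ (5 - \lambda) x^1 + (8 - 4\lambda) x^2 &= 0 \end{aligned} \right\}$$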

For $\lambda = \lambda_1 = -1$ we find the solution $x^1 = 2$, $x^2 = -1$ and write it as a vector: $l_1 = \{2, -1\}$; for $\lambda = \lambda_2 = 3$ we find the solution $l_2 = \{2, 1\}$. The vectors $l_1$ and $l_2$ are orthogonal in the metric $g$ because they are the eigenvectors of the auxiliary operator that is self-adjoint in the metric $g$ and correspond to different eigenvalues. The basis in which the forms are given is not orthogonal in the metric $g$. For this reason, the orthogonality of $l_1$ and $l_2$ is not immediately apparent. But in checking the computations the reader will see that $(l_1, l_2) = g(l_1, l_2) = 0$.

Let us now compute the norms of $l_1$ and $l_2$ in the metric $g$:

$$\|l_1\| = \sqrt{g(l_1, l_1)} = 2, \quad \|l_2\| = \sqrt{g(l_2, l_2)} = \sqrt{12}$$

Normalize the basis:

$$e_1' = \{1, -1/2\}, \quad e_2' = \{2/\sqrt{12}, 1/\sqrt{12}\}$$

In the basis $e_1', e_2'$ the quadratic forms are

$$f = -(x'^1)^2 + 3(x'^2)^2,$$

$$g = (x'^1)^2 + (x'^2)^2$$

Remark. The eigenvectors $l_1$ and $l_2$ also form a basis in which both forms have canonical form, but with different coefficients of the squares. The normalization of $l_1$ and $l_2$ in the metric $g$ is needed so as to be able to write out the reduced quadratic forms without any supplementary computation of their coefficients and make use of the earlier found roots of the characteristic equation.
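To illustrate the remark, the coefficients in the unnormalized basis $l_1, l_2$ can be computed directly. This is a worked sketch; the coordinate symbols $\xi^1, \xi^2$ (coordinates relative to $l_1, l_2$) are introduced here for illustration and do not appear in the text.

```latex
\begin{aligned}
g(l_1, l_1) &= 4, & g(l_2, l_2) &= 12, \\
f(l_1, l_1) &= \lambda_1\, g(l_1, l_1) = -4, & f(l_2, l_2) &= \lambda_2\, g(l_2, l_2) = 36, \\
f &= -4(\xi^1)^2 + 36(\xi^2)^2, & g &= 4(\xi^1)^2 + 12(\xi^2)^2 .
\end{aligned}
```

Both forms are still free of the product $\xi^1\xi^2$, but the roots $-1$ and $3$ now appear only as the ratios $-4/4$ and $36/12$.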


§  6.  Skew-adjoint  transformations 

1. Recall that a linear transformation $z = Ay$ in Euclidean space is said to be skew-adjoint (or skew) if $A^* = -A$.

Exhibiting this relation relative to some basis, we get

$$(A^*)^i_k = -A^i_k$$




Therefore, the condition that $A$ is skew-adjoint is written in components (relative to an arbitrary basis) thus:

$$\sum_j g_{ij} A^j_k = -\sum_j g_{kj} A^j_i$$

or

$$A_{ik} = -A_{ki}$$

In an orthonormal basis, the upper and lower indices of tensors are of equal status, and the matrix of a skew-adjoint transformation will also be skew-symmetric.

Note that a linear transformation can always be written so that it is defined by covariant components. To do this, it is necessary to represent the argument $y$ in the given basis:

$$y = y^1 e_1 + y^2 e_2 + \dots + y^n e_n$$

and expand the function $z$ in terms of the reciprocal basis:

$$z = z_1 e^1 + z_2 e^2 + \dots + z_n e^n$$

Then the transformation $z = Ay$ can be written as

$$z_i = \sum_k A_{ik} y^k$$

where $A_{ik}$ is the covariant tensor of the transformation $A$.


2. We now prove that in the three-dimensional case the skew transformation $z = Ay$ can be represented as a vector multiplication of a fixed vector $a$ by the vector $y$.

Theorem. If $z = Ay$ is a skew-adjoint transformation in three-dimensional Euclidean space, then there exists a unique vector $a$ such that

$$z = [a \times y]$$

Remark. Since the transformation $Ay$ is invariant under a change of basis, the vector $a$ in the equation $Ay = [a \times y]$ is not invariant but axial (an axial tensor of order one).

Proof of the theorem. We know that

$$z_i = \sum_k A_{ik} y^k$$

On the other hand, if $z = [a \times y]$, then

$$z_i = \sum_{\alpha, k} \varepsilon_{i\alpha k} a^\alpha y^k$$

where $\varepsilon_{i\alpha k}$ is the discriminant tensor.

To prove the theorem, we have to ensure the following equations:

$$\sum_\alpha \varepsilon_{i\alpha k} a^\alpha = A_{ik}$$

where $A_{ik}$ is the given tensor, $A_{ki} = -A_{ik}$ ($i, k = 1, 2, 3$), and $a = \{a^1, a^2, a^3\}$ is the desired vector. We have a system of nine linear equations for the three components of the vector $a$. We have to prove its consistency by utilizing the skew symmetry of the given tensor $A_{ik}$ and the special form of the left-hand member. The system of equations is written as a tensor equation, and so we can verify the equation in terms of arbitrary components, and thus simplify the problem via a special choice of components. We reason as follows.

The determinant of a skew-symmetric matrix of odd order is always zero. (This property is obvious for the three-dimensional case at hand.) For this reason, the bilinear form

$$f(u, v) = \sum_{i, k} A_{ik} u^i v^k$$

which corresponds to the tensor $A_{ik}$ is singular and its zero subspace has dimension not less than unity.

Choose an orthonormal basis $e_1, e_2, e_3$ so that the vector $e_3$ is in the zero subspace of the bilinear form (it is immaterial whether it is in the right or left subspace).

Then the matrix $\|A_{ik}\|$ is simplified in the following manner:

$$\|A_{ik}\| = \begin{Vmatrix} 0 & A_{12} & 0 \\ -A_{12} & 0 & 0 \\ 0 & 0 & 0 \end{Vmatrix}$$

The principal component of the discriminant tensor $\varepsilon_{123} = 1$ because the basis is orthonormal, whence, as is readily computed,

$$\Bigl\| \sum_\alpha \varepsilon_{i\alpha k} a^\alpha \Bigr\| = \begin{Vmatrix} 0 & -a^3 & a^2 \\ a^3 & 0 & -a^1 \\ -a^2 & a^1 & 0 \end{Vmatrix}$$

It is now clear that for the matrices $\|A_{ik}\|$ and $\bigl\| \sum_\alpha \varepsilon_{i\alpha k} a^\alpha \bigr\|$ to coincide, it is necessary to put

$$a = \{a^1, a^2, a^3\} = \{0, 0, -A_{12}\}$$

No other vector is suitable. The proof of the theorem is complete.
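The correspondence between a skew-symmetric matrix and its vector $a$ can be sketched numerically. The code assumes an orthonormal, right-handed basis (so that $\varepsilon_{123} = 1$) and reads the components of $a$ off the matrix of $\sum_\alpha \varepsilon_{i\alpha k} a^\alpha$ displayed above; the function names are illustrative.

```python
def axial_vector(A):
    """Vector a with A y = [a x y] for a 3x3 skew-symmetric matrix A.

    Comparing with the matrix of the vector product:
    A12 = -a3, A13 = a2, A23 = -a1 (indices counted from 1).
    """
    return (-A[1][2], A[0][2], -A[0][1])


def cross(a, y):
    """Vector product [a x y] in a right-handed orthonormal basis."""
    return (a[1] * y[2] - a[2] * y[1],
            a[2] * y[0] - a[0] * y[2],
            a[0] * y[1] - a[1] * y[0])


def apply(A, y):
    """Matrix-vector product A y."""
    return tuple(sum(A[i][k] * y[k] for k in range(3)) for i in range(3))
```

For the special matrix used in the proof, with the single independent entry $A_{12}$, `axial_vector` returns $\{0, 0, -A_{12}\}$, in agreement with the theorem.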

3. Mechanical interpretation. In three-dimensional Euclidean space, fix a point $O$ and through it draw a straight line collinear with a vector $a$. Take this line for the axis of rotation. Let us find the distribution of linear velocities of points of a rigid body rotating about this axis with constant angular velocity $\omega = |a|$. The linear velocity $v$ depends solely on the position that the moving point occupies at a given time. This position is characterized by a radius vector $\overrightarrow{OM} = y$ (the moving point of the body passes through the geometric point $M$ in space). The velocity $v$ is orthogonal to the plane in which the vectors $a$ and $y$ lie. The numerical value of the linear velocity $|v|$ is equal to the product of the angular velocity $|a|$ by the distance of the point from the axis of rotation, and this product coincides with the area of the parallelogram constructed on the vectors $a$ and $y$ (Fig. 62). So with a proper choice of orientation of the basis, the velocity of any point of the rotating body is expressed by the formula $v = [a \times y]$.

Thus, any skew transformation $Ay$ in $E_3$ may be interpreted as a distribution of velocities of a uniformly rotating body: the point with radius vector $y$ has an instantaneous linear velocity $v = Ay$; the vector $a$ of angular velocity, and so also the axis of rotation, are found via Subsection 2.

§  7.  Isometric  transformations 

1. Definition. A linear transformation $I$ is said to be isometric if it preserves the norm of every vector:

$$\|Ix\| = \|x\| \qquad (1)$$

From now on we will be dealing solely with linear transformations in Euclidean spaces.

2. From the definition it follows that if a transformation is isometric, then it is nonsingular, since a singular transformation would carry a nonzero vector into the zero vector. Therefore the isometric transformation $z = Iy$ has an inverse, $y = I^{-1}z$, which is also isometric.

Remark. For the invertibility of an isometric transformation it is essential to assume that the space is finite-dimensional.

3. Theorem. If $I$ is an isometric transformation, then $(Ix, Iy) = (x, y)$ for any pair of vectors $x$, $y$.

Corollary. Since an isometric transformation preserves norms and scalar products, it also preserves the angle between any two vectors, that is, the angle between them is equal to the angle between their images.

Proof of the theorem. Substituting into (1) the sum $x + y$ in place of $x$ and squaring both members, we get

$$\|I(x + y)\|^2 = \|x + y\|^2$$

or

$$(I(x + y), I(x + y)) = (x + y, x + y)$$

Using the linearity of $I$, we obtain

$$(Ix, Ix) + 2(Ix, Iy) + (Iy, Iy) = (x, x) + 2(x, y) + (y, y)$$

But here $(Ix, Ix) = \|Ix\|^2 = (x, x)$, $(Iy, Iy) = \|Iy\|^2 = (y, y)$, and so

$$(Ix, Iy) = (x, y) \qquad (2)$$

which proves the theorem.
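The same argument can be condensed into a single chain of equalities. The sketch below uses the polarization identity of a real Euclidean space, a standard fact that the text does not state explicitly.

```latex
\begin{aligned}
(x, y) &= \tfrac{1}{2}\bigl(\|x + y\|^2 - \|x\|^2 - \|y\|^2\bigr), \\
(Ix, Iy) &= \tfrac{1}{2}\bigl(\|I(x + y)\|^2 - \|Ix\|^2 - \|Iy\|^2\bigr)
          = \tfrac{1}{2}\bigl(\|x + y\|^2 - \|x\|^2 - \|y\|^2\bigr) = (x, y).
\end{aligned}
```

The scalar product is thus recovered from norms alone, which is why preservation of norms already forces preservation of scalar products.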

4. Henceforth we assume the space to be Euclidean and $n$-dimensional and we denote it by $E_n$.

Set $Iy = z$ in (2). Then $y = I^{-1}z$, whence

$$(Ix, z) = (x, I^{-1}z) \qquad (2')$$

When the vector $y$ runs through the entire space $E_n$, the vector $z$ does likewise because $I$ is nonsingular. Thus, (2') holds for all vectors $x$, $z$ in $E_n$. This means that

$$I^{-1} = I^* \qquad (3)$$

Remark. It is readily verified that the three conditions (1), (2) and (3) are equivalent, that is, each one implies the other two.

5. The relation (3) may be rewritten as follows:

$$I^*I = II^* = E$$

where $E$ is the identical transformation, whence it follows that the matrix of $I$ is orthogonal relative to an orthonormal basis. Since condition (3) is equivalent to (1), isometric transformations are the only ones which have orthogonal matrices in an orthonormal basis.
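The orthogonality condition is easy to test numerically. The following is a minimal sketch, assuming the matrix is given as a list of rows relative to an orthonormal basis; it checks $I^*I = E$ with $I^*$ taken as the transpose, up to a tolerance `tol` chosen here arbitrarily.

```python
import math


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_orthogonal(m, tol=1e-12):
    """True if transpose(m) * m equals the identity within tol."""
    n = len(m)
    p = mat_mul(transpose(m), m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


def rotation_about_x3(theta):
    """Matrix of a rotation of three-dimensional space about the x3 axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]
```

A rotation matrix passes the check, while a shear such as `[[1.0, 1.0], [0.0, 1.0]]` does not, since it changes the norms of some vectors.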

6. Let $e_1, \dots, e_n$ be an orthonormal basis and let

$$\begin{aligned} Ie_1 &= I_{11}e_1 + I_{21}e_2 + \dots + I_{n1}e_n, \\ &\ \ \vdots \\ Ie_n &= I_{1n}e_1 + I_{2n}e_2 + \dots + I_{nn}e_n \end{aligned}$$

The vectors $Ie_1, \dots, Ie_n$ also form an orthonormal basis because an isometric transformation sends unit vectors into unit vectors, and orthogonal vectors into orthogonal vectors.

By the foregoing, the transformation matrix $\|I_{ik}\|$ is orthogonal.

Thus, with every isometric transformation is associated an orthogonal matrix, and with every orthogonal matrix is associated an isometric transformation. For every pair of orthonormal bases there is a unique isometric transformation that carries one of the given bases into the other.

7. Note some properties of isometric transformations.

(1) If the transformation $I$ has an eigenvector $e$, that is, if $Ie = \lambda e$, $e \ne 0$, then $\lambda = \pm 1$ (this follows immediately from the preservation of norm of a vector).

(2) $\det I = \pm 1$, for the determinant of an orthogonal matrix is always equal to $\pm 1$, and $\det I$ is an invariant. Hence to prove this property it suffices to consider the transformation $I$ in an orthonormal basis.

If $\det I = +1$, then the bases $e_1, \dots, e_n$ and $Ie_1, \dots, Ie_n$ are of the same orientation and we have a transformation similar to the motion of a rigid body.

If $\det I = -1$, the bases $e_1, \dots, e_n$ and $Ie_1, \dots, Ie_n$ have different orientations and the transformation is a reflection.

(3) If $e$ is an eigenvector and $L_e$ is the orthogonal complement of the linear hull of $e$, then $L_e$ is an invariant subspace.

Proof. Let $x \in L_e$; then $(Ix, Ie) = (x, e) = 0$. On the other hand, $(Ix, Ie) = \lambda(Ix, e)$ with $\lambda = \pm 1 \ne 0$, so that $(Ix, e) = 0$, that is, $Ix \in L_e$.


8. Let us consider some examples. We assume that the transformation matrices are written out in orthonormal bases. The orthogonality of the matrices (and hence the isometric nature of the transformations that follow) is established by a simple check, which we leave to the reader.

(1) The identical transformation is isometric.

(2) The reflection of $n$-dimensional space about the hyperplane $x_1 = 0$ carries an arbitrary vector $x = x_1e_1 + x_2e_2 + \dots + x_ne_n$ into the vector $Ix = -x_1e_1 + x_2e_2 + \dots + x_ne_n$. The matrix of this transformation is

$$I = \begin{Vmatrix} -1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{Vmatrix}$$

The subspace $x_1 = 0$ is invariant and the transformation induced in it is identical.

(3) $n = 1$. In a one-dimensional space, an isometric transformation is either identical, $Ix = x$, or is a reflection, $Ix = -x$.

Indeed, in the one-dimensional case the matrix $I$ consists of only one element $I_{11}$. Taking into account the second property of the preceding subsection, we find $I_{11} = \det \|I_{11}\| = \pm 1$.

(4) $n = 2$. A rotation of a plane through an angle $\theta$ is an isometric transformation (see Section 7, Chapter VIII).

(5) $n = 3$. The rotation of space through an angle $\theta$ about the axis $x_3$ is given by the matrix

$$I = \begin{Vmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

$e_3$ is an eigenvector and the subspaces $L(e_3)$, $L(e_1, e_2)$ are invariant.

(6) A multidimensional generalization of the preceding example is the rotation of $n$-dimensional space about an $(n - 2)$-dimensional subspace $L(e_3, \dots, e_n)$:

$$I = \begin{Vmatrix} \cos\theta & -\sin\theta & & & 0 \\ \sin\theta & \cos\theta & & & \\ & & 1 & & \\ & & & \ddots & \\ 0 & & & & 1 \end{Vmatrix}$$

The subspace $L(e_3, \dots, e_n)$ remains fixed and an identical transformation is induced in it.

(7) $n = 4$. A transformation with the matrix

$$I = \begin{Vmatrix} \cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & \cos\beta & -\sin\beta \\ 0 & 0 & \sin\beta & \cos\beta \end{Vmatrix}$$

may be regarded as a simultaneous rotation of the space $E_4$ in two mutually orthogonal directions: through the angle $\alpha$ about the plane $L(e_3, e_4)$ and through the angle $\beta$ about the plane $L(e_1, e_2)$.

The planes $L(e_1, e_2)$ and $L(e_3, e_4)$ are invariant subspaces. If the angles $\alpha$, $\beta$ are not multiples of $\pi$, then the transformation $I$ does not have any eigenvectors since its characteristic polynomial

$$p(\lambda) = (\lambda^2 - 2\lambda\cos\alpha + 1)(\lambda^2 - 2\lambda\cos\beta + 1)$$

has only complex roots.
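Why the roots are complex can be seen from the discriminant of each quadratic factor: for $\lambda^2 - 2\lambda\cos\alpha + 1$ it equals $4\cos^2\alpha - 4 = -4\sin^2\alpha$, which is negative unless $\alpha$ is a multiple of $\pi$. The sketch below lists the real eigenvalues of the double rotation; the function names and the tolerance `tol` are assumptions of the illustration.

```python
import math


def real_roots_of_factor(angle, tol=1e-12):
    """Real roots of lam^2 - 2*lam*cos(angle) + 1.

    A quarter of the discriminant is cos^2(angle) - 1 = -sin^2(angle),
    so a real (double) root lam = cos(angle) = +1 or -1 exists only
    when the angle is a multiple of pi.
    """
    c = math.cos(angle)
    quarter_disc = c * c - 1.0
    if quarter_disc < -tol:
        return []
    return [c]


def real_eigenvalues(alpha, beta):
    """Real eigenvalues of the four-dimensional double rotation."""
    return real_roots_of_factor(alpha) + real_roots_of_factor(beta)
```

For generic angles the list is empty, so no eigenvector exists; when one angle is a multiple of $\pi$ the corresponding plane consists of eigenvectors with eigenvalue $\pm 1$.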

9. To get a better feeling of the specific nature of isometric transformations in spaces of dimension greater than three, compare the last example with a rotation of three-dimensional space.

We assume that the spaces rotate uniformly and the matrices in examples (5) and (7) of Subsection 8 characterize the rotation in unit time. Then a rotation in the three-dimensional case during time $t$ is specified by the matrix

$$I(t) = \begin{Vmatrix} \cos\theta t & -\sin\theta t & 0 \\ \sin\theta t & \cos\theta t & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

in the four-dimensional case by the matrix

$$I(t) = \begin{Vmatrix} \cos\alpha t & -\sin\alpha t & 0 & 0 \\ \sin\alpha t & \cos\alpha t & 0 & 0 \\ 0 & 0 & \cos\beta t & -\sin\beta t \\ 0 & 0 & \sin\beta t & \cos\beta t \end{Vmatrix}$$

Clearly, in the three-dimensional case all points of the axis of rotation are fixed, while the remaining points of the space describe circles whose centres lie on the axis of rotation. The planes of these circles are perpendicular to the axis of rotation. Each point performs a complete rotation during the same time $T = 2\pi/\theta$.

The picture is different in the four-dimensional case. Only the origin is fixed. The points of the invariant planes $L(e_1, e_2)$ and $L(e_3, e_4)$ move in circles centred at the origin, but the rotation periods about the origin in these two planes differ: in the plane $L(e_1, e_2)$ the period is $T_1 = 2\pi/\alpha$ and in the plane $L(e_3, e_4)$ the period is $T_2 = 2\pi/\beta$. If the angles $\alpha$ and $\beta$ are incommensurable, then every point of space that does not belong to one of these two planes moves along a nonclosed path without self-intersections and will never return to the original position. Indeed, for a point to return to its original position, it is necessary and sufficient that its projections on the planes $L(e_1, e_2)$ and $L(e_3, e_4)$ return simultaneously to their initial positions, which is impossible since the periods $T_1$ and $T_2$ are incommensurable.

If the angles $\alpha$ and $\beta$ are commensurable, the periods $T_1$ and $T_2$ are also commensurable, and then the paths of all points are closed but will not in general be circles. In this case, the rotation period of points not lying in the planes $L(e_1, e_2)$ and $L(e_3, e_4)$ is equal to the least common multiple of the periods $T_1$ and $T_2$.
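The least common multiple of two commensurable periods can be computed from the ratio of the angles. The sketch below assumes that $\alpha = a u$ and $\beta = b u$ with positive rational numbers `a`, `b` and a common unit `u`; if $\alpha/\beta = m/n$ in lowest terms, the common period is $T = m T_1 = n T_2$. The function names and the parameter `unit` are hypothetical.

```python
from fractions import Fraction
import math


def revolutions_until_return(a, b):
    """Return (m, n) with alpha/beta = m/n in lowest terms.

    After the common period the projection on L(e1, e2) has made
    m full turns and the projection on L(e3, e4) has made n full turns.
    """
    ratio = Fraction(a) / Fraction(b)
    return ratio.numerator, ratio.denominator


def common_period(a, b, unit=1.0):
    """Least common multiple of T1 = 2*pi/alpha and T2 = 2*pi/beta."""
    m, _ = revolutions_until_return(a, b)
    return m * 2 * math.pi / (a * unit)
```

For example, with `a = 2` and `b = 3` the periods are $T_1 = \pi/u$ and $T_2 = 2\pi/(3u)$, and the common period is $2T_1 = 3T_2 = 2\pi/u$.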

§  8.  The  canonical  form  of  an  isometric  transformation 

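1. Theorem. For every isometric transformation $I$ in $n$-dimensional Euclidean space $E_n$ there exists an orthonormal basis $e_1, \dots, e_n$ in which the transformation matrix has the following canonical form

$$\begin{Vmatrix} \pm 1 & & & & & & 0 \\ & 1 & & & & & \\ & & \ddots & & & & \\ & & & 1 & & & \\ & & & & I_{\theta_1} & & \\ & & & & & \ddots & \\ 0 & & & & & & I_{\theta_k} \end{Vmatrix} \qquad (1)$$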

Here we use $I_\theta$ to denote the matrix of rotation of a two-dimensional plane through the angle $\theta$:

$$I_\theta = \begin{Vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{Vmatrix}$$

The sign $\pm$ in front of the first diagonal element of matrix (1) coincides with the sign of the determinant of the transformation. The submatrices $I_{\theta_i}$ may be altogether absent (compare the first three examples of Subsection 8, Section 7) or may occupy the entire diagonal (see example (7) of Subsection 8, Section 7).

The proof of the theorem is given below in Subsection 8 of this section.

2. Now let us look into the geometric meaning of this theorem. It is easy to verify that matrix (1) can be represented as a product:

$$\begin{Vmatrix} 1 & & & 0 \\ & \ddots & & \\ & & 1 & \\ 0 & & & I_{\theta_k} \end{Vmatrix} \times \dots \times \begin{Vmatrix} 1 & & & & & 0 \\ & \ddots & & & & \\ & & I_{\theta_1} & & & \\ & & & 1 & & \\ & & & & \ddots & \\ 0 & & & & & 1 \end{Vmatrix} \times \begin{Vmatrix} \pm 1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{Vmatrix} \qquad (2)$$

Formula (2) shows that the transformation $I$ reduces to the following.

If $\det I = -1$, then first a reflection is performed about the hyperplane which, relative to the basis $e_1, \dots, e_n$, has the equation $x_1 = 0$ (if $\det I = +1$, there is no reflection).

Following that, the linear hull $L(e_1, \dots, e_{n-2k})$ of the first $n - 2k$ basis vectors remains fixed, while the entire space is successively rotated through the angles $\theta_1, \dots, \theta_k$ about the $(n - 2)$-dimensional subspaces $L(e_1, \dots, e_{n-2k}, e_{n-2k+3}, \dots, e_n), \dots, L(e_1, \dots, e_{n-2})$. All two-dimensional planes $L(e_{n-2k+1}, e_{n-2k+2}), \dots, L(e_{n-1}, e_n)$ in the directions of which these rotations occur are orthogonal among themselves and also to the subspace $L(e_1, \dots, e_{n-2k})$.


3. Corollary to the theorem of Subsection 1. (1) Let the isometric transformation $I$ have $\det I = -1$. Then it has an eigenvector and, for even $n$, at least two independent eigenvectors.

(2) In two-dimensional Euclidean space only three types of isometric transformations are possible:

(a) the identical transformation;

(b) a reflection relative to some one-dimensional subspace;

(c) a rotation through an angle $\theta$ ($0 < \theta < 2\pi$).

(3) In three-dimensional Euclidean space only four types of isometric transformations are possible:

(a) the identical transformation;



(b) a reflection relative to a two-dimensional subspace;

(c) a rotation through the angle $\theta$ ($0 < \theta < 2\pi$) about a one-dimensional subspace;

(d) the product of a reflection relative to some two-dimensional subspace by a rotation about its orthogonal complement.
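The two-dimensional part of the corollary can be turned into a small classifier. This is a sketch under stated assumptions: the input is taken to be an orthogonal $2 \times 2$ matrix written in an orthonormal basis (orthogonality is not rechecked), and the labels, the function name `classify_2d` and the tolerance are illustrative.

```python
import math


def classify_2d(m, tol=1e-9):
    """Classify an orthogonal 2x2 matrix given as a list of rows.

    Returns ("identity", 0.0), ("rotation", theta) with 0 < theta < 2*pi,
    or ("reflection", None).
    """
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det < 0:
        # det = -1: reflection relative to a one-dimensional subspace.
        return ("reflection", None)
    # det = +1: the first column is (cos(theta), sin(theta)).
    theta = math.atan2(m[1][0], m[0][0]) % (2 * math.pi)
    if theta < tol or 2 * math.pi - theta < tol:
        return ("identity", 0.0)
    return ("rotation", theta)
```

The sign of the determinant separates reflections from rotations, and the angle is read off the image of the first basis vector, as in the proof of Lemma 3 below.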

4. We now take up the proof of the theorem stated in Subsection 1. First, however, in Subsections 5-7 we will establish a few auxiliary propositions that are of interest in themselves.

5. Lemma 1. For every linear transformation in a real linear space $L$ there exists either a one-dimensional invariant subspace or a two-dimensional invariant subspace such that the transformation induced in it has a positive determinant.

Proof. If the characteristic polynomial $p(\lambda)$ has a real root $\lambda_1$, then by Section 6, Chapter VII, to this root corresponds an eigenvector whose linear hull is an invariant subspace.

Let $p(\lambda)$ have no real roots. Then the transformation $A$ does not have a single eigenvector.

We write down the relation $(A - \lambda E)x = 0$ relative to an arbitrary basis $e_1, \dots, e_n$ and substitute for $\lambda$ the complex root $\alpha + i\beta$ of the characteristic polynomial $p(\lambda)$. This yields a homogeneous system of linear equations with the unknowns $x^1, \dots, x^n$ and with complex coefficients. This system can be written in matrix form as follows:

$$(A - (\alpha + i\beta)E) \begin{Vmatrix} x^1 \\ \vdots \\ x^n \end{Vmatrix} = \begin{Vmatrix} 0 \\ \vdots \\ 0 \end{Vmatrix} \qquad (3)$$

The determinant of system (3) is zero:

$$\det(A - (\alpha + i\beta)E) = p(\alpha + i\beta) = 0$$

and so (3) has a nontrivial solution $(x^1, \dots, x^n)$. Decompose it into the real and imaginary parts:

$$\begin{Vmatrix} x^1 \\ \vdots \\ x^n \end{Vmatrix} = \begin{Vmatrix} y^1 + iz^1 \\ \vdots \\ y^n + iz^n \end{Vmatrix} \qquad (4)$$

and consider the vectors

$$y = y^1e_1 + \dots + y^ne_n \in L,$$

$$z = z^1e_1 + \dots + z^ne_n \in L$$




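We use the same symbols $y$ and $z$ to denote the elements $(y^1, \dots, y^n)$ and $(z^1, \dots, z^n)$ of the coordinate (component) space $K_n$. Putting solution (4) into system (3) we get

$$\begin{Vmatrix} 0 \\ \vdots \\ 0 \end{Vmatrix} = (A - (\alpha + i\beta)E)(y + iz) = (Ay - \alpha y + \beta z) + i(Az - \alpha z - \beta y)$$

whence

$$\left. \begin{aligned} Ay &= \alpha y - \beta z, \\ Az &= \beta y + \alpha z \end{aligned} \right\} \qquad (5)$$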


Equations (5) are obtained algebraically as relations between the elements of the coordinate space $K_n$. Geometrically, formulas (5) express the action of the transformation $A$ on the vectors $y, z \in L$, written in the basis $e_1, \dots, e_n$. But the transformation does not depend on the choice of basis, and so (5) may be regarded as invariant vector equations in the space $L$.

We now show that the vectors $y$ and $z$ are linearly independent. First of all, $z \ne 0$, since otherwise the first of equations (5) would signify that $A$ has an eigenvector $y$ ($Ay = \alpha y$). Therefore, if there is a linear relationship between $y$ and $z$, then

$$y = \gamma z \qquad (6)$$

Substituting (6) into the second equation of (5), we get $Az = (\alpha + \beta\gamma)z$, which likewise contradicts the absence of eigenvectors in the transformation $A$.

Thus, $y$ and $z$ are linearly independent and their linear hull $L(y, z)$ is two-dimensional.

Formulas (5) show that $L(y, z)$ is an invariant subspace of $A$ and permit finding the determinant of the transformation induced in $L(y, z)$. This determinant is

$$\begin{vmatrix} \alpha & \beta \\ -\beta & \alpha \end{vmatrix} = \alpha^2 + \beta^2 > 0$$

since $\beta$ is definitely nonzero (otherwise the root $\lambda = \alpha + i\beta$ would be real). The proof of Lemma 1 is complete.
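Relations (5) can be checked on a concrete matrix. The sketch below uses a hypothetical example that is not taken from the text: the matrix `A_EXAMPLE` with characteristic polynomial $\lambda^2 - 4\lambda + 5$, its complex root $2 + i$, and the complex solution `X_EXAMPLE` of system (3).

```python
def mat_vec(m, v):
    """Product of a real matrix (list of rows) and a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def check_relations_5(m, lam, x, tol=1e-12):
    """Split x = y + i z and test Ay = alpha*y - beta*z, Az = beta*y + alpha*z.

    Returns (ok, det) where det = alpha^2 + beta^2 is the determinant
    of the transformation induced in L(y, z).
    """
    alpha, beta = lam.real, lam.imag
    y = [c.real for c in x]
    z = [c.imag for c in x]
    ay, az = mat_vec(m, y), mat_vec(m, z)
    ok = all(abs(ay[i] - (alpha * y[i] - beta * z[i])) <= tol and
             abs(az[i] - (beta * y[i] + alpha * z[i])) <= tol
             for i in range(len(x)))
    return ok, alpha ** 2 + beta ** 2


# Hypothetical example: no real eigenvalues, roots 2 + i and 2 - i.
A_EXAMPLE = [[1, -2],
             [1, 3]]
LAM_EXAMPLE = complex(2, 1)
X_EXAMPLE = [complex(-1, 1), complex(1, 0)]   # solves (A - lam E) x = 0
```

Here $y = \{-1, 1\}$, $z = \{1, 0\}$, and the determinant $\alpha^2 + \beta^2 = 5$ coincides with $\det A$, as it must when $L(y, z)$ is the whole plane.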

6. Lemma 2. Given in Euclidean space $E_n$ an isometric transformation $I$. Let the subspace $E'$ be invariant under $I$. Then the orthogonal complement $E''$ of $E'$ is also an invariant subspace.




Proof. Lemma 2 follows from the fact that an isometric transformation is nonsingular and preserves orthogonality of vectors. Indeed, if $x \in E'$, $y \in E''$, then $(x, y) = 0$ and $(Ix, Iy) = 0$. When vector $x$ runs through the entire subspace $E'$, its image $Ix$ also runs through $E'$ entirely (see Subsection 2, Section 4, Chapter VII). Hence, vector $Iy$ is orthogonal to $E'$ and therefore lies in $E''$. The vector $y$ in $E''$ may be taken arbitrarily. Thus, $I(E'') \subset E''$.

7. Lemma 3. In two-dimensional Euclidean space, every isometric transformation $I$ with positive determinant is a rotation through an angle $\theta$.

Proof. Take an orthonormal basis $e_1, e_2$. Let the vector $Ie_1$ form an angle $\theta$ with the vector $e_1$. Since the length of $Ie_1$ is equal to that of $e_1$, it follows that $Ie_1$ is obtained from $e_1$ by a rotation through the angle $\theta$. The vector $Ie_2$ is orthogonal to $Ie_1$ and the orientation of the new basis $Ie_1, Ie_2$ is the same as that of the original basis (since $\det I > 0$). Hence, $Ie_2$ is obtained from $e_2$ by a rotation through the same angle $\theta$. The transformation $I$ preserves the lengths of all vectors and the angles between any two vectors, and so all vectors rotate through the same angle $\theta$. In the particular case of $\theta$ being a multiple of $2\pi$, the transformation is identical.

8. Proof of the theorem. Lemmas 1 and 2 permit decomposing the space $E_n$ into a direct sum of one-dimensional and two-dimensional invariant subspaces. By Lemma 1 there exists an invariant subspace $E^{(1)}$; by Lemma 2 its orthogonal complement is also invariant, and Lemma 1 can again be applied to it, etc. We thus obtain

$$E_n = E^{(1)} \oplus E^{(2)} \oplus \dots \oplus E^{(p)} \qquad (7)$$

where $p = n - k$ ($k$ is the number of two-dimensional subspaces in the sum (7)).

We assume that in the right member of (7) first come the one-dimensional subspaces in which $I$ has the eigenvalue $+1$, then the one-dimensional subspaces in which the eigenvalue is equal to $-1$, and finally the two-dimensional subspaces in which there are no eigenvalues (and the determinant of the induced transformation is positive in accordance with Lemma 1). The transformations induced in the two-dimensional subspaces of (7) are rotations through the angles $\varphi_1, \dots, \varphi_k$, by Lemma 3. In each of the subspaces $E^{(1)}, \dots, E^{(p)}$ choose an orthonormal basis. Their union yields an orthonormal basis $e_1, \dots, e_n$ of $E_n$, since all the $E^{(i)}$ are pairwise orthogonal. Relative to the basis $e_1, \dots, e_n$, the transformation
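matrix becomes

$$I = \begin{pmatrix}
1 & & & & & & & & \\
 & \ddots & & & & & & & \\
 & & 1 & & & & & & \\
 & & & -1 & & & & & \\
 & & & & \ddots & & & & \\
 & & & & & -1 & & & \\
 & & & & & & \begin{matrix} \cos\varphi_1 & -\sin\varphi_1 \\ \sin\varphi_1 & \cos\varphi_1 \end{matrix} & & \\
 & & & & & & & \ddots & \\
 & & & & & & & & \begin{matrix} \cos\varphi_k & -\sin\varphi_k \\ \sin\varphi_k & \cos\varphi_k \end{matrix}
\end{pmatrix} \tag{8}$$

(all entries not shown are zero). Noting that

$$\begin{pmatrix} \cos\pi & -\sin\pi \\ \sin\pi & \cos\pi \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \tag{9}$$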

we can replace an even number of minus ones on the diagonal of matrix (8) by one half that number of submatrices of type (9). Geometrically, this means that the product of two reflections of the plane with respect to mutually perpendicular straight lines is equal to a rotation of the plane through the angle $\pi$.
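The geometric remark can be checked numerically. The following is a minimal sketch, not part of the original argument; the helper names `rotation`, `matmul` and `close` are illustrative assumptions, and the two reflections are taken in the coordinate axes of an orthonormal basis.

```python
import math

def rotation(phi):
    """Matrix of the rotation of the plane through the angle phi."""
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

# Reflections of the plane in the lines spanned by e1 and by e2.
reflect_e1 = [[1.0, 0.0], [0.0, -1.0]]
reflect_e2 = [[-1.0, 0.0], [0.0, 1.0]]

product = matmul(reflect_e1, reflect_e2)   # two minus ones on the diagonal
half_turn = rotation(math.pi)              # a submatrix of type (9)

print(close(product, half_turn))
```

The comparison uses a small tolerance because the floating-point value of sin π is not exactly zero.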

If  the  number  of  minus  ones  is  odd,  one  of  them  will  fail  to 
enter  into  a  submatrix  of  type  (9)  and  it  can  then  be  shifted  to 
the  start  of  the  diagonal  by  renumbering  the  basis  vectors. 

Then the matrix of $I$ will assume the form (1) (the number $k$ will generally change compared with formula (8) due to the appearance of new submatrices of type (9)). The theorem is proved.


§  9.  The  motion  of  a  rigid  body  with  one  fixed  point 

1. Consider in three-dimensional Euclidean space the motion of a rigid body with one fixed point $O$, which we take for the origin with the orthonormal basis $e_1, e_2, e_3$.

At the initial instant of time, let an arbitrary point of the body be at $M$, and during time $t$ move to the point $M_t$. Set

$$\overrightarrow{OM} = x, \qquad \overrightarrow{OM_t} = y$$

and denote by $I(t)$ the transformation that associates with the vector $x$ the vector $y$:

$$y = I(t)\,x$$




Geometrically,  the  motion  of  a  rigid  body  means  that  every 
rectilinear  segment  formed  by  points  of  the  body  is  carried,  via 
the  motion,  into  a  rectilinear  segment  of  the  same  length. 

It is therefore possible to construct a variable orthonormal basis $e_{1t}, e_{2t}, e_{3t}$ moving together with the body, in which the coordinates of the vector $\overrightarrow{OM_t}$ preserve constant numerical values.

For every fixed $t$, the change from the basis $e_1, e_2, e_3$ to the basis $e_{1t}, e_{2t}, e_{3t}$ is specified by an orthogonal matrix. From the foregoing it follows that the transformation $I(t)$ is linear and isometric for every $t$. Relative to the basis $e_1, e_2, e_3$, this transformation is written thus:

$$y_k = \sum_{\alpha} I_{k\alpha}(t)\, x_\alpha$$

Suppose the components $I_{k\alpha}(t)$ are differentiable functions of the time $t$.

Let us find the distribution of linear velocities $v$ of points of the body at an arbitrary time $t$. In other words, we wish to find $v$ for every point $M_t$, that is, $v$ as a function of $y$. For every point we have

$$v = v_1 e_1 + v_2 e_2 + v_3 e_3 = \frac{dy}{dt}$$

Thus

$$v_k = \sum_{\alpha} \left( \frac{d}{dt} I_{k\alpha}(t) \right) x_\alpha$$

This can be written symbolically as

$$v = I'_t\, x$$

where $I'_t$ is a linear transformation whose matrix is obtained by differentiating the elements of the matrix of $I$ with respect to the argument $t$. Noting that

$$x = I^{-1} y = \overset{*}{I} y$$

we obtain the desired function as a linear transformation:

$$v = I'_t\, \overset{*}{I}\, y$$

2. Let us investigate the transformation $A = I'_t\, \overset{*}{I}$ in more detail. We find the adjoint $\overset{*}{A}$. Take advantage of the fact that in an orthonormal basis it suffices to take the transpose of the matrix in order to pass to the adjoint transformation, while taking the transpose of a product of two matrices is performed via the familiar formula: $(I'_t I^*)^* = (I^*)^* (I'_t)^*$. Taking the transpose of a transpose returns the matrix to its initial form, and the differentiation of elements of the matrix with respect to $t$ clearly commutes with
the transpose operation. We therefore have the matrix equations

$$A^* = (I'_t I^*)^* = I\,(I^*)'_t \tag{1}$$

On the other hand, we have $I(t)\, \overset{*}{I}(t) = E$ for the transformations and also the equation

$$I(t)\, I^*(t) = E \tag{2}$$

for their matrices, which hold identically with respect to $t$. It is possible to prove that a product of matrices can be differentiated by the same rule as a product of functions, and so from (2) we have

$$(I I^*)'_t = I'_t I^* + I\,(I^*)'_t = E'_t = 0 \tag{3}$$

From (1) and (3) it follows that

$$A + \overset{*}{A} = 0$$

We have thus established that the linear transformation

$$v = Ay$$

is skew-adjoint ($\overset{*}{A} = -A$).

In Section 6 it was shown that a constant skew-adjoint transformation yields an instantaneous distribution of velocities of a body rotating with a constant angular velocity about a fixed axis and can be represented by the formula

$$v = [a \times y]$$

In the case at hand, the transformation $A$ and the vector $a$ depend on the time.

Conclusion. When a rigid body is in motion with one fixed point $O$, the field of instantaneous linear velocities of its points is at every instant of time the same as if the body were in rotation about an axis with constant angular velocity, but the axis and the angular velocity depend on the choice of time.

That  is  why,  in  mechanics,  one  speaks  of  the  “rotation  of  a  body 
about  a  fixed  point”  and  not  the  “motion  of  a  body  with  a  fixed 
point”. 

The vector $a = a(t)$ is called the angular velocity of instantaneous rotation of the body. The straight line passing through $O$ in the direction of $a(t)$ is known as the instantaneous axis of rotation of the body.
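The skew-adjointness of $A$ and the formula $v = [a \times y]$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the motion `motion(t)` (a spin about $e_3$ combined with a slower turn about $e_1$) is a hypothetical example, the derivative is approximated by a central difference, and all function names are chosen here for convenience.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def motion(t):
    # Hypothetical I(t): spin about e3 combined with a slower turn about e1.
    return matmul(rot_z(t), rot_x(0.5 * t))

def velocity_operator(t, h=1e-5):
    # A = I'_t I^*, with I'_t approximated by a central difference.
    plus, minus = motion(t + h), motion(t - h)
    deriv = [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(3)]
             for i in range(3)]
    return matmul(deriv, transpose(motion(t)))

def angular_velocity(a_matrix):
    # For a skew matrix A, A y = [a x y] with a = (A32, A13, A21).
    return [a_matrix[2][1], a_matrix[0][2], a_matrix[1][0]]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

t0 = 0.7
A = velocity_operator(t0)
a = angular_velocity(A)
y = [0.3, -1.2, 2.0]                       # an arbitrary point of the body
v_matrix = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
v_cross = cross(a, y)
skew_defect = max(abs(A[i][j] + A[j][i]) for i in range(3) for j in range(3))
print(skew_defect, a)
```

For this particular motion the vector `a` changes direction with `t0`, which is the time dependence of the instantaneous axis described in the text.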

How the instantaneous axis of rotation changes with time is clearly seen in the case of a spinning top.




§  10.  The  curvature  and  torsion  of  a  space  curve 

1. Let a curve $S$ be given in three-dimensional Euclidean space. For the sake of clarity, we assume that $S$ is the path of a point $M$ moving with unit velocity, that is to say, during unit time it traverses an arc of $S$ whose length is equal to unity. Under this condition, the time spent is numerically equal to the length of the path traversed. Denote the length by $s$ and view it as an independent argument.

Denote by $t$ the vector of instantaneous velocity of $M$. We can consider it to be a unit vector:

$$(t, t) = 1 \tag{1}$$

It  lies  along  a  tangent  to  the  curve  S  in  the  direction  of  motion 
and  is  called  the  tangent  vector. 


Fig. 63


If the line is not straight, then $t$ changes its direction in space. Therefore, the point $M$ experiences an acceleration equal to the derivative of the velocity vector, or $t'_s$.

It can be proved that the scalar product of vectors is differentiated by the same rule as a product of functions. Differentiating (1) with respect to $s$, we find that the acceleration is orthogonal to the velocity:

$$(t, t)'_s = 2\,(t, t'_s) = 0$$

Putting $k(s) = |t'_s|$ and assuming that $k(s) \neq 0$, we introduce the unit vector $n$, which is coincident in direction with $t'_s$ (Fig. 63). Then

$$t'_s = kn \tag{2}$$

$k$ is called the curvature of the curve at the given point $M$. By definition, $k \geq 0$. The vector $n$ is called the principal normal vector, while the plane passing through $M$ parallel to the vectors $t$ and $n$ is termed the osculating plane of the space curve $S$ at the point $M$.

Let us construct the unit vector $b = [t \times n]$. It is perpendicular to the osculating plane and is called the binormal vector at the point $M$ (Fig. 63).



For every value of the argument $s$, the triple of unit vectors $t$, $n$, $b$ forms an orthonormal basis that is connected in a natural manner with the geometric properties of the curve in the neighbourhood of $M$. The basis $t$, $n$, $b$ is called a trihedral (or moving trihedral).

2. Lay off the vectors of the moving trihedral in space from the fixed point $O$. Then as the argument $s$ varies, the trihedral rotates about $O$ as a rigid body.

The velocity vector of instantaneous rotation of the moving trihedral is called the Darboux vector. Denote it by $d = d(s)$. For an arbitrary vector $u$ rigidly anchored to the moving trihedral we have, by the results of the preceding section,

$$u'_s = [d \times u] \tag{3}$$

In particular,

$$t'_s = [d \times t] \tag{4}$$

Now let us see how the Darboux vector is located relative to the trihedral. We introduce the notation $\sigma = (d, t)$ and from (4) find the projections of the vector $d$ on the directions $n$ and $b$. To do so, substitute into (4) the expansion

$$d = \sigma t + \lambda n + \mu b$$

with undetermined coefficients $\lambda$, $\mu$. Noting that $t'_s = kn$, we get

$$kn = \sigma\,[t \times t] + \lambda\,[n \times t] + \mu\,[b \times t] = -\lambda b + \mu n$$

whence $\lambda = 0$, $\mu = k$, and so

$$d = \sigma t + kb \tag{5}$$

(Fig. 64).




The function $\sigma = \sigma(s)$ is called the torsion of the curve $S$. Putting (5) into (3), we obtain

$$u'_s = \sigma\,[t \times u] + k\,[b \times u] \tag{6}$$

Formula (6) shows that the instantaneous rotation of the trihedral can be decomposed into a sum of two rotational motions: about the tangent and about the binormal. The first component has an angular velocity equal to the torsion of the curve, the second, an angular velocity equal to the curvature of the curve. The angular velocity of the total rotation of the trihedral is $|d| = \sqrt{k^2 + \sigma^2}$.

3. Putting the vectors $n$ and $b$ into (6) in place of $u$, we find the expansion of their derivatives with respect to the basis $t$, $n$, $b$. Together with (2) these expansions constitute the so-called Frenet formulas:

$$t'_s = kn,$$
$$n'_s = -kt + \sigma b,$$
$$b'_s = -\sigma n$$

which are important in the theory of curves.
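As an illustration, the Frenet formulas and the Darboux vector can be checked numerically on a circular helix. This is a sketch under assumptions that are not in the text: the helix with radius `R = 2` and rise parameter `H = 1` is a hypothetical example parametrized by arc length, derivatives are approximated by central differences, and the helper names are illustrative.

```python
import math

R, H = 2.0, 1.0              # hypothetical helix: radius and rise parameter
C = math.hypot(R, H)         # with this scale s is the arc length

def tangent(s):
    """Unit tangent t(s) of r(s) = (R cos(s/C), R sin(s/C), H s / C)."""
    return [-R / C * math.sin(s / C), R / C * math.cos(s / C), H / C]

def diff(f, s, h):
    """Central-difference derivative of a vector function f at s."""
    p, m = f(s + h), f(s - h)
    return [(pi - mi) / (2.0 * h) for pi, mi in zip(p, m)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def frame(s):
    """Moving trihedral t, n, b and curvature k at the point s."""
    dt = diff(tangent, s, 1e-4)
    k = math.sqrt(dot(dt, dt))          # k = |t'_s|
    t = tangent(s)
    n = [c / k for c in dt]             # t'_s = k n
    b = cross(t, n)                     # b = [t x n]
    return t, n, b, k

s0 = 1.3
t, n, b, k = frame(s0)
dn = diff(lambda s: frame(s)[1], s0, 1e-3)
db = diff(lambda s: frame(s)[2], s0, 1e-3)
sigma = -dot(db, n)                     # from b'_s = -sigma n
frenet_n = [-k * ti + sigma * bi for ti, bi in zip(t, b)]
darboux = [sigma * ti + k * bi for ti, bi in zip(t, b)]   # formula (5)
print(k, sigma, darboux)
```

For the helix the Darboux vector comes out directed along the axis of the helix, so the trihedral turns about that fixed direction as the point moves.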

§ 11. The decomposition of an arbitrary linear transformation into
the product of a self-adjoint and an isometric transformation

1. The purpose of this section is to represent any linear transformation in Euclidean space in the form of a composition of a self-adjoint transformation and an isometric transformation.

2. Definition. A self-adjoint transformation $A$ is said to be nonnegative if $(Ax, x) \geq 0$ for any $x$.

Lemma 1. If a self-adjoint transformation is nonnegative, then all the roots of its characteristic polynomial are nonnegative.

Remark. The fact that the transformation is self-adjoint is essential, for then we are certain that all the characteristic roots are real.

Proof. Let $\lambda$ be a characteristic root and $x$ a corresponding eigenvector. Then $Ax = \lambda x$ and

$$(Ax, x) = (\lambda x, x) = \lambda \|x\|^2 \geq 0$$

since $(Ax, x) \geq 0$, whence $\lambda \geq 0$.

Lemma 2. If $A$ is a nonnegative self-adjoint transformation, then there is a nonnegative self-adjoint transformation $B$ such that $A = BB$.




Remark. The transformation $B$ is called the square root of the transformation $A$.

Proof. Because $A$ is self-adjoint, there is an orthonormal basis in which the matrix of $A$ has diagonal form:

$$A = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$

By Lemma 1, all $\lambda_i \geq 0$ and so the $\sqrt{\lambda_i}$ are real nonnegative numbers.

Relative to the same basis, determine the transformation $B$ by the matrix

$$B = \begin{pmatrix} \sqrt{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{\lambda_n} \end{pmatrix}$$

The basis is orthonormal and the matrix $B$ is diagonal, and so the transformation $B$ is self-adjoint ($\overset{*}{B} = B$). From the formula for $B$ it is clear that for the matrices we have the relation $BB = A$, and so the same relation holds for the transformations. Write out $y = Bx$ in terms of coordinates:

$$y_1 = \sqrt{\lambda_1}\,x_1, \quad \dots, \quad y_n = \sqrt{\lambda_n}\,x_n$$

whence

$$(Bx, x) = (y, x) = x_1 y_1 + \dots + x_n y_n = \sqrt{\lambda_1}\,x_1^2 + \dots + \sqrt{\lambda_n}\,x_n^2 \geq 0$$

so that the transformation $B$ is nonnegative. Lemma 2 is proved.

Remark.  Consider  the  scalar  product  (Ax,  x) .  If  this  quadratic 
form  is  positive  definite,  then  the  transformation  A  is  said  to  be 
positive  definite  or  positive.  In  that  case  A  is  nonsingular  and  all 
its  characteristic  roots  are  positive.  From  the  proof  of  Lemma  2  it 
is  evident  that  the  square  root  of  a  positive  transformation  is  also 
a  positive  transformation. 


3.  In  this  section  we  will  denote  an  adjoint  transformation  by 
an  asterisk  placed  at  the  side  of  the  symbol  (as  in  the  case  of  the 
transpose  of  a  matrix)  and  not  on  top.  Thus  A*  is  the  adjoint 
of  A.  This  symbolism  is  more  convenient  in  computations  but  we 
must  bear  in  mind  that  if  the  same  letter  A  is  used  to  denote  the 


matrix of the transformation $A$, then the transpose $A^*$ of the matrix will, generally speaking, be the matrix of the adjoint transformation $A^*$ only relative to an orthonormal basis.

Later on we will need identities that hold both for matrices and for transformations:

(1) $(AB)^* = B^*A^*$,

(2) $(A^*)^* = A$,

(3) $(AB)^{-1} = B^{-1}A^{-1}$,

(4) $(A^*)^{-1} = (A^{-1})^*$

We first have to verify the truth of these formulas for matrices (see Sections 2, 3, Chapter II) and then consider transformations in an orthonormal basis. Relative to such a basis, the above formulas for transformations follow immediately from the matrix equations.


4. Theorem. For every nonsingular linear transformation $A$ in $n$-dimensional Euclidean space $E_n$ there exist a self-adjoint transformation $B$ and an isometric transformation $I$ such that

$$A = IB \tag{1}$$

Remark. Similarly, there exist a self-adjoint transformation $B_1$ and an isometric transformation $I_1$ such that $A = B_1 I_1$.

Proof of the theorem. Consider the transformation $A^*A$. It is self-adjoint:

$$(A^*A)^* = A^*(A^*)^* = A^*A$$

It is nonsingular since $A$ is nonsingular by hypothesis. Besides, the transformation $A^*A$ is positive definite:

$$(A^*Ax, x) = (Ax, Ax) = \|Ax\|^2 > 0$$

if $x \neq 0$. By Lemma 2 we can take the square root of $A^*A$:

$$\sqrt{A^*A} = B, \qquad A^*A = BB$$

where $B$ is a positive definite self-adjoint transformation. And so

$$A = (A^*)^{-1}BB$$

Putting

$$I = (A^*)^{-1}B$$

we get the following representation for $A$:

$$A = IB$$

It remains to prove that $I$ is an isometric transformation. To do so, compute $I^*$ using the self-adjointness of $B$:

$$I^* = B^*\bigl((A^*)^{-1}\bigr)^* = BA^{-1}$$



Furthermore we have

$$I^*I = BA^{-1}(A^*)^{-1}B = B(A^*A)^{-1}B$$

By the construction of the transformation $B$,

$$(A^*A)^{-1} = (BB)^{-1} = B^{-1}B^{-1}$$

so that

$$I^*I = BB^{-1}B^{-1}B = E$$

whence follows the isometric nature of the transformation $I$. The proof is complete.
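The construction in the proof can be followed step by step on a small example. The sketch below works with $2 \times 2$ matrices in an orthonormal basis, where the adjoint is the transpose. It is an illustration under assumptions: the closed-form square root used in `sqrt_spd` is a shortcut valid only for $2 \times 2$ positive definite symmetric matrices (it replaces the diagonalization of Lemma 2), and the matrix `A_mat` and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def sqrt_spd(m):
    # 2x2 shortcut: sqrt(M) = (M + sqrt(det M) E) / sqrt(tr M + 2 sqrt(det M)).
    d = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    scale = math.sqrt(m[0][0] + m[1][1] + 2.0 * d)
    return [[(m[0][0] + d) / scale, m[0][1] / scale],
            [m[1][0] / scale, (m[1][1] + d) / scale]]

def polar(a):
    """Return (I, B) with A = I B, B = sqrt(A* A), I = (A*)^{-1} B."""
    b = sqrt_spd(matmul(transpose(a), a))
    i = matmul(inverse(transpose(a)), b)
    return i, b

A_mat = [[2.0, 1.0], [0.0, 1.0]]        # a hypothetical nonsingular matrix
I_mat, B_mat = polar(A_mat)
print(I_mat, B_mat)
```

One can then confirm that `I_mat` has orthonormal columns, that `B_mat` is symmetric, and that their product reproduces `A_mat` up to rounding.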

5. We shall call the self-adjoint transformation $B$ in the expansion (1) the essential part of the transformation $A$.

§  12.  Applications  to  the  theory  of  elasticity. 

The  strain  tensor  and  the  stress  tensor 

1. Consider a continuous elastic medium. Fix a point $O$ in it and take a certain volume containing the point. Suppose that in the absence of external forces this volume is a sphere $U$ with centre at $O$ and that under the action of external forces it is deformed and displaced. However, if we disregard parallel displacement in space, we can regard $O$ as being fixed. An arbitrary point $M$ in the sphere $U$ is described by the vector $x = \overrightarrow{OM}$. Suppose that as a result of deformation, $M$ moves to position $M'$. Put $\overrightarrow{OM'} = y$. Experiment shows that

$$y = Cx + r(x) \tag{1}$$

where $C$ is a nonsingular linear transformation with positive determinant, and for small $x$ the vector $r = r(x)$ is an infinitesimal of higher order, that is,

$$\lim_{|x| \to 0} \frac{|r|}{|x|} = 0 \tag{2}$$

If $U$ is small, then the vector $r$ may be disregarded, and then in place of the transformation (1) we can consider the linear transformation

$$y = Cx \tag{3}$$

Remark. If condition (2) is observed, the linear transformation (3) is called the differential of the nonlinear transformation (1).




Generally, for a given deformation of the elastic medium, the linear transformation $C$ depends on the choice of the point $O$.

2. By the theorem proved in the preceding section, the linear transformation $C$ may be decomposed into two factors:

$$C = I\tilde{C}$$

where $\tilde{C}$ is a self-adjoint transformation (the essential part of $C$) and $I$ is an isometric transformation with $\det I = +1$.

The transformation $\tilde{C}$ characterizes the deformation of the elastic medium near the point $O$. It constitutes a compression along three mutually perpendicular directions and therefore carries the sphere $U$ into an ellipsoid $V$ (Fig. 65).

The transformation $I$ characterizes a rotation of $V$ as a rigid body about the point $O$ (Fig. 66).

In most actually encountered cases the ellipsoid $V$ differs but slightly from the sphere $U$. For example, in metals under loads that do not go beyond the limits of elastic deformations, the semiaxes of the ellipsoid $V$ ordinarily differ from the radius of the sphere $U$ by only a fraction of a per cent. For this reason, the transformation $\tilde{C}$ is close to the unit transformation $E$ and is represented as a sum:

$$\tilde{C} = E + B$$

Here, $B$ (as can readily be demonstrated) is also a self-adjoint transformation.

Let $e_1, e_2, e_3$ be an orthonormal basis relative to which we write the matrix $B = \|b_{ij}\|$.

The doubly covariant tensor $b_{ij}$ is called the strain tensor.

The quantity $\theta = \sum_{i=1}^{3} b_{ii}$ (the trace of the operator $B$) is called the coefficient of volume change. This name is due to the fact that the eigenvalues of the operator $B$ are small and the ratio of the
volume of the ellipsoid $V$ to the volume of the sphere $U$ is approximately equal to $1 + \theta$ (to within a quantity of the order of the square of the eigenvalues of the operator $B$).
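The approximation can be checked on numbers. The volume ratio of the ellipsoid to the sphere equals the determinant of the self-adjoint factor $E + B$, so the sketch below compares that determinant with $1 + \theta$. The strain matrix `B_strain` is a hypothetical example with entries of the order of a fraction of a per cent, and the function names are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace(m):
    return sum(m[i][i] for i in range(3))

# Hypothetical small symmetric strain matrix B (illustrative values only).
B_strain = [[0.0020, 0.0005, 0.0000],
            [0.0005, -0.0010, 0.0003],
            [0.0000, 0.0003, 0.0015]]

# Matrix of E + B, which carries the sphere U into the ellipsoid V.
C_sym = [[(1.0 if i == j else 0.0) + B_strain[i][j] for j in range(3)]
         for i in range(3)]

theta = trace(B_strain)          # coefficient of volume change
volume_ratio = det3(C_sym)       # volume of V divided by volume of U
print(volume_ratio, 1.0 + theta)
```

The two printed numbers differ only by terms of second order in the entries of `B_strain`.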

3. When an elastic body undergoes deformation, stresses arise. Through the point $O$ pass a plane oriented by the unit normal $n$ and consider small parts of the elastic medium adjoining this plane


Fig.  67 


near $O$. We replace the actual interaction of these parts by the forces applied to them.

The force of action of one part of an elastic body on another referred to unit cross-sectional area is called the stress at the given point $O$ for a given orientation $n$ and is denoted by $p$ (Fig. 67). The dependence of $p$ on the direction $n$ at $O$ can be expressed to a high degree of accuracy by the following type of formula:

$$p = F(n)$$

where $F = \|f_{ik}\|$ is a linear transformation that in general is dependent on the choice of the point $O$.

The tensor $f_{ik}$ is called the stress tensor.

4. If the elastic body is homogeneous and isotropic, that is, if it has identical mechanical properties at all points and in all directions, and the deformations are small, then the relation between the strain tensor and the stress tensor is given by Hooke's law:

$$F = \lambda\theta E + 2\mu B$$

or, in coordinates,

$$f_{ik} = \lambda\theta\,\delta_{ik} + 2\mu\, b_{ik}$$

Here, $\lambda$ and $\mu$ are constants that describe the mechanical properties of the elastic medium and $\theta$ is the coefficient of volume change (see Subsection 2).
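Hooke's law in coordinates translates directly into a short computation. The sketch below is an illustration only: the constants `LAM` and `MU` are hypothetical values in arbitrary units, the strain matrix describes a small extension along $e_1$, and the function names are chosen here for convenience.

```python
def trace(m):
    return sum(m[i][i] for i in range(3))

def stress_tensor(strain, lam, mu):
    """Matrix F = lam * theta * E + 2 * mu * B for a strain matrix B."""
    theta = trace(strain)
    return [[lam * theta * (1.0 if i == k else 0.0) + 2.0 * mu * strain[i][k]
             for k in range(3)] for i in range(3)]

def stress_vector(stress, normal):
    """Stress p = F(n) on the plane with unit normal n."""
    return [sum(stress[i][k] * normal[k] for k in range(3)) for i in range(3)]

LAM, MU = 100.0, 80.0            # hypothetical elastic constants
strain = [[1e-3, 0.0, 0.0],      # small extension along e1 (illustrative)
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]

F = stress_tensor(strain, LAM, MU)
p = stress_vector(F, [1.0, 0.0, 0.0])    # plane with normal e1
print(F, p)
```

Even though the strain has a single nonzero component, the term with $\theta$ produces nonzero diagonal stresses in all three directions.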


Chapter  X 


MULTIVECTORS  AND  OUTER  FORMS 


§  1.  Alternation 

1. Let $a, b, c, \dots$ be arbitrary vectors of a given linear space $L$. We will consider products of vectors in $L$, regarding these products as contravariant tensors, which is to say, as elements of the linear spaces $T_0^2$, $T_0^3$, etc. Then $L$ itself is $T_0^1$ (see Chapter V). Thus, for example, we have $a \in T_0^1$, $ab \in T_0^2$, $abc \in T_0^3$, and so on.

We use the term alternation of a product of vectors for an operation which is denoted by square brackets and is defined by the following equations:

$$[a] = a,$$
$$[ab] = \frac{1}{2!}\,(ab - ba),$$
$$[abc] = \frac{1}{3!}\,(abc + bca + cab - bac - acb - cba)$$


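For any number of arbitrary vectors $a_1, a_2, \dots, a_m \in L$ we set

$$[a_1 a_2 \dots a_m] = \frac{1}{m!}\Bigl(\sum\nolimits_1 a_{i_1} a_{i_2} \dots a_{i_m} - \sum\nolimits_2 a_{i_1} a_{i_2} \dots a_{i_m}\Bigr)$$

where $\sum_1$ means the sum of all products obtained from the product $a_1 a_2 \dots a_m$ for even permutations of the indices $1, 2, \dots, m$; $\sum_2$ has a similar meaning for odd permutations. We can write this differently as

$$[a_1 a_2 \dots a_m] = \frac{1}{m!}\,\delta^{j_1 j_2 \dots j_m}_{1\,2\,\dots\,m}\; a_{j_1} a_{j_2} \dots a_{j_m} \tag{1}$$

where the sum on the right is taken over all indices $j_1, j_2, \dots, j_m$, each of which independently runs through all values from 1 to $m$; $\delta^{j_1 j_2 \dots j_m}_{1\,2\,\dots\,m} = +1$ if $j_1 j_2 \dots j_m$ is an even permutation of the $m$-tuple $1, 2, \dots, m$; $\delta^{j_1 j_2 \dots j_m}_{1\,2\,\dots\,m} = -1$ if $j_1 j_2 \dots j_m$ is an odd permutation of the $m$-tuple $1, 2, \dots, m$; $\delta^{j_1 j_2 \dots j_m}_{1\,2\,\dots\,m} = 0$ if there are two identical values among the values $j_1, j_2, \dots, j_m$.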
2. The alternation of a product of vectors possesses the following properties:

(1) Linearity in any factor; for example, in the first factor

$$[(\alpha a_1 + \beta b_1)\, a_2 \dots a_m] = \alpha\,[a_1 a_2 \dots a_m] + \beta\,[b_1 a_2 \dots a_m]$$

(2) Skew symmetry with respect to any pair of factors; for instance, the first pair:

$$[a_1 a_2 a_3 \dots a_m] = -[a_2 a_1 a_3 \dots a_m]$$

These properties are immediately evident from the definition of an alternation and so we give no proof here.

3. If there are two identical vectors among $a_1, a_2, \dots, a_m$, then $[a_1 a_2 \dots a_m] = 0$. Here the symbol 0 denotes the zero tensor of the space $T_0^m$. The assertion is clear since under an interchange of the identical vectors the alternation $[a_1 a_2 \dots a_m]$ does not change, whereas by skew symmetry its sign must change.

4. If the vectors $a_1, a_2, \dots, a_m$ are linearly dependent, then $[a_1 a_2 \dots a_m] = 0$. Indeed, assume for the sake of simplicity that $a_1, \dots, a_k$ is a maximal independent subsystem of the system of vectors $a_1, \dots, a_m$; we then have $a_{k+1} = \alpha_1 a_1 + \dots + \alpha_k a_k$. But then from this, via the property of linearity and due to Subsection 3, we obtain

$$[a_1 \dots a_m] = \alpha_1\,[a_1 \dots a_k a_1 \dots a_m] + \dots + \alpha_k\,[a_1 \dots a_k a_k \dots a_m] = \alpha_1 \cdot 0 + \dots + \alpha_k \cdot 0 = 0$$

5. Furthermore we assume that the given space $L$ is $n$-dimensional.

6. Then if $m > n$ it follows that $[a_1 a_2 \dots a_m] = 0$ (this assertion follows directly from Subsection 4).

7. Let $x$ be an arbitrary tensor in $T_0^k$. By the definition of $T_0^k$, the tensor $x$ is a sum of products of certain vectors of $L$ containing $k$ vectorial factors in each summand. We accordingly write

$$x = a_1^{(1)} a_2^{(1)} \dots a_k^{(1)} + \dots + a_1^{(N)} a_2^{(N)} \dots a_k^{(N)} \tag{2}$$

where $a_i^{(j)} \in L$. We use the term alternation of a tensor $x$ ($x \in T_0^k$) for a tensor of the same order that is denoted by $[x]$ ($[x] \in T_0^k$) and is defined by the equation

$$[x] = [a_1^{(1)} a_2^{(1)} \dots a_k^{(1)}] + \dots + [a_1^{(N)} a_2^{(N)} \dots a_k^{(N)}] \tag{3}$$

8. The alternation of a tensor is not dependent on the mode of notation in the form (2). In other words, if $x' = x$, then $[x'] = [x]$. Thus, let

$$x' = b_1^{(1)} b_2^{(1)} \dots b_k^{(1)} + \dots + b_1^{(M)} b_2^{(M)} \dots b_k^{(M)} \tag{4}$$

then by our definition

$$[x'] = [b_1^{(1)} b_2^{(1)} \dots b_k^{(1)}] + \dots + [b_1^{(M)} b_2^{(M)} \dots b_k^{(M)}] \tag{5}$$

Suppose $x' = x$. This means that the sum (2) reduces to the sum (4) by means of admissible replacements (see Chapter V). But by the linearity of an alternation, to each admissible replacement in (2) there corresponds precisely the same one in (3). Therefore, (3) reduces to (5) via the same admissible replacements that reduce (2) to (4). Thus, $[x'] = [x]$.


9. We have the identity

$$[[a_1 a_2 \dots a_m]] = [a_1 a_2 \dots a_m] \tag{6}$$

which states that two successive alternations of a product of vectors return it to the original alternation.

Let us convince ourselves that (6) holds in two elementary cases: $m = 1$ and $m = 2$. Here the identity is obvious:

$$[[a_1]] = [a_1],$$
$$[[a_1 a_2]] = \tfrac12\bigl([a_1 a_2] - [a_2 a_1]\bigr) = \tfrac12\Bigl(\tfrac12 (a_1 a_2 - a_2 a_1) - \tfrac12 (a_2 a_1 - a_1 a_2)\Bigr) = \tfrac12 (a_1 a_2 - a_2 a_1) = [a_1 a_2]$$

In the general case, we have, according to (1) and (3),

$$[[a_1 a_2 \dots a_m]] = \frac{1}{m!}\,\delta^{j_1 \dots j_m}_{1 \dots m}\,[a_{j_1} \dots a_{j_m}] \tag{7}$$

But it is easy to see that in the sum on the right, all nonzero terms are the same and each of them is equal to $[a_1 a_2 \dots a_m]$. Indeed, by skew symmetry with respect to any pair of upper indices of the quantity $\delta^{j_1 \dots j_m}_{1 \dots m}$ and because of skew symmetry with respect to any pair of indices of the alternation $[a_{j_1} \dots a_{j_m}]$, in each term of the sum (7) we can interchange any pair of indices in the $m$-tuple $j_1 \dots j_m$ without altering the term. This means that without altering any of the terms we can arrange all indices in natural order. Since $\delta^{1 \dots m}_{1 \dots m} = 1$, it follows that every term reduces to $[a_1 a_2 \dots a_m]$. Since there are $m!$ nonzero terms in the sum (7), it follows that (7) implies (6).

Remark. It is now clear why it is advantageous, in the definition of an alternation, to take not merely the algebraic sum of the products of the vectors but this sum divided by $m!$.
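The properties of Subsections 2 to 4 and the identity (6) can be tried out on components. The following is a minimal sketch under assumptions of our own: a tensor is stored as a dictionary from index tuples (numbered from 0) to exact fractions, the vectors `a`, `b`, `c` are hypothetical examples in a three-dimensional space, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def tensor_product(*vectors):
    """Components of the product a_1 a_2 ... a_m of vectors of one space."""
    n = len(vectors[0])
    comps = {}
    for idx in product(range(n), repeat=len(vectors)):
        value = Fraction(1)
        for vec, i in zip(vectors, idx):
            value *= vec[i]
        comps[idx] = value
    return comps

def alternate(x):
    """Alternation: signed sum over index permutations, divided by m!."""
    m = len(next(iter(x)))
    result = {}
    for idx in x:
        total = Fraction(0)
        for perm in permutations(range(m)):
            total += sign(perm) * x[tuple(idx[p] for p in perm)]
        result[idx] = total / factorial(m)
    return result

a, b, c = [1, 2, 0], [0, 1, 3], [2, 0, 1]    # hypothetical vectors
alt_abc = alternate(tensor_product(a, b, c))
alt_bac = alternate(tensor_product(b, a, c))
alt_aba = alternate(tensor_product(a, b, a))
print(alt_abc[(0, 1, 2)])
```

Applying `alternate` a second time to `alt_abc` returns the same components, which is the role of the factor $1/m!$ noted in the Remark.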

10. Due to (6) we have, for any tensor $x \in T_0^k$,

$$[[x]] = [x]$$

11. If $k > n$, then for any $x \in T_0^k$ we have $[x] = 0$ (see Subsection 6).

12. Let $e_1, e_2, \dots, e_n$ be a basis in $L$. Then, as we know, all products $e_{i_1} e_{i_2} \dots e_{i_k}$ constitute a basis in $T_0^k$ and for every tensor $x \in T_0^k$ we have the expansion

$$x = x^{i_1 i_2 \dots i_k}\, e_{i_1} e_{i_2} \dots e_{i_k}$$

From this we obtain an expression for the alternation of the tensor $x$ relative to the given (arbitrary) basis:

$$[x] = x^{i_1 \dots i_k}\,[e_{i_1} \dots e_{i_k}] \tag{8}$$

13. From now on we will use the symbol $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$ as having the following meaning. We will assume that the indices $i_1, \dots, i_k$, $j_1, \dots, j_k$ take on any values from 1 to $n$ (where $n$ is the dimension of $L$). The number of all lower (or upper) indices, the number $k$, may be arbitrary. If all numerical values $i_1, \dots, i_k$ are distinct, then $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = +1$ when $j_1 \dots j_k$ is an even permutation of the set of numbers $i_1, \dots, i_k$; $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = -1$ if $j_1 \dots j_k$ is an odd permutation of the numbers $i_1, \dots, i_k$. In all other cases, $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$. Thus, $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$ if among the numerical values $i_1, \dots, i_k$ (or $j_1, \dots, j_k$) there are two identical ones or if among the numerical values $j_1, \dots, j_k$ there is one that is absent from the set $i_1, \dots, i_k$ (and conversely). In particular, $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$ when $k > n$.

According to the definition just given, the set of numbers $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$ possesses skew symmetry both in the upper and in the lower indices. In other words, interchanging two upper indices or two lower indices changes the sign of $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$.
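The definition of Subsection 13 can be restated as a small function. This is a sketch, with the function name `generalized_delta` and the tuple representation of the index sets being assumptions made for the illustration.

```python
def generalized_delta(upper, lower):
    """Value of delta with upper indices j_1..j_k and lower indices i_1..i_k.

    Returns +1 or -1 when the lower values are distinct and the upper
    indices form an even or odd permutation of them, and 0 otherwise.
    """
    if len(upper) != len(lower):
        raise ValueError("upper and lower index sets must have equal length")
    # Repeated values, or an upper value absent from the lower set, give 0.
    if len(set(lower)) != len(lower) or sorted(upper) != sorted(lower):
        return 0
    # Position of each upper index in the lower tuple, then count inversions.
    positions = [lower.index(j) for j in upper]
    inversions = sum(1 for p in range(len(positions))
                     for q in range(p + 1, len(positions))
                     if positions[p] > positions[q])
    return -1 if inversions % 2 else 1

print(generalized_delta((2, 3, 1), (1, 2, 3)))
print(generalized_delta((2, 1), (1, 2)))
print(generalized_delta((1, 3), (1, 2)))
```

Swapping two entries of either argument changes the sign of a nonzero value, in agreement with the skew symmetry stated above.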

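14. It is easy to see that (1) implies the equation

$$[e_{i_1} \dots e_{i_k}] = \frac{1}{k!}\,\delta^{j_1 \dots j_k}_{i_1 \dots i_k}\; e_{j_1} \dots e_{j_k} \tag{9}$$

where the indices $i_1, \dots, i_k$ are fixed and the sum on the right is taken over all values of the indices $j_1, \dots, j_k$ from 1 to $n$. Hence, the right member of (9) is a resolution of the tensor $[e_{i_1} \dots e_{i_k}]$ relative to a basis in the space $T_0^k$; the numbers $\frac{1}{k!}\,\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$ (for fixed $i_1, \dots, i_k$) are the components of this tensor.

15. Rewrite (8) with the aid of (9):

$$[x] = \frac{1}{k!}\,\delta^{j_1 \dots j_k}_{i_1 \dots i_k}\; x^{i_1 \dots i_k}\, e_{j_1} \dots e_{j_k}$$

We now introduce the notation

$$x^{[j_1 \dots j_k]} = \frac{1}{k!}\,\delta^{j_1 \dots j_k}_{i_1 \dots i_k}\; x^{i_1 \dots i_k} \tag{10}$$

In other words, we put

$$x^{[i]} = x^{i},$$
$$x^{[ij]} = \frac{1}{2!}\,(x^{ij} - x^{ji}),$$
$$x^{[ijk]} = \frac{1}{3!}\,(x^{ijk} + x^{jki} + x^{kij} - x^{jik} - x^{ikj} - x^{kji})$$

By (10) we get

$$[x] = x^{[j_1 \dots j_k]}\, e_{j_1} \dots e_{j_k} \tag{11}$$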

The operation defined by (10) is called the alternation of the components of the tensor $x$, or the alternation of the indices $j_1, \dots, j_k$. This was discussed in Chapter V.

Comparing (8) and (11), we see that in order to obtain the alternation of a tensor $x$ we can proceed in one of two ways:

(1) either replace, in the expansion of the tensor $x$, all products of the basis vectors $e_{i_1} \dots e_{i_k}$ by their alternations $[e_{i_1} \dots e_{i_k}]$, leaving the old coefficients $x^{i_1 \dots i_k}$;

(2) or replace the coefficients $x^{i_1 \dots i_k}$ by the corresponding alternations $x^{[i_1 \dots i_k]}$, leaving unchanged the products of the basis vectors $e_{i_1} \dots e_{i_k}$.

For the sake of clarity we demonstrate the equivalence of these two operations in the case of $k = 2$. If $x = \sum x^{ij} e_i e_j$, then

$$\begin{aligned}
[x] &= \sum x^{ij}\,[e_i e_j] = \sum x^{ij}\,\tfrac12\,(e_i e_j - e_j e_i) \\
&= \tfrac12 \sum x^{ij} e_i e_j - \tfrac12 \sum x^{ij} e_j e_i \\
&= \tfrac12 \sum x^{ij} e_i e_j - \tfrac12 \sum x^{ji} e_i e_j = \sum x^{[ij]} e_i e_j
\end{aligned}$$

16.  Equation  (11)  is  an  expansion  of  the  tensor  [x]  in  terms  of 
a  basis  in  T *.  Hence  the  alternations  x^1  of  the  components 
of  tensor  x  are  the  components  of  its  alternation  [x], 

17.  From  (10)  it  follows  that  the  alternations  x^1'  "1^  of  the 
components  of  an  arbitrary  tensor  x  e  if,  possess  skew  symmetry 
in  any  pair  of  indices. 

18.  Definition.  A  tensor  r(re  7’*)  is  said  to  be  skew  if  [x]  =  x. 
From  this  definition  and  from  Subsection  16  it  follows  that  the 

equation 

xfi-'-ih  —  xV  '•••'*]  (12) 

holds  true  for  the  components  of  a  skew  tensor. 

Thus,  the  components  of  a  skew  tensor  have  skew  symmetry  in 
any  pair  of  indices.  Conversely,  if  the  components  xl,  '‘k  of  any 
tensor  xe  T*  have  skew  symmetry,  then  from  (10)  we  get  (12) 
(via  the  very  same  arguments  used  to  prove  the  identity  (6)  in 
Subsection  9).  Whence  [x]  =  x. 

Thus,  the  definition  of  a  skew  tensor  x  via  the  condition  [x]  =  x 
is  equivalent  to  the  definition  via  the  property  of  the  skew  sym¬ 
metry  of  the  components  (see  Chapter  V,  Section  8). 

Remark. Since the equation $[x] = x$ holds for any tensor $x$ in $T_0^1$, it follows that all first-order tensors are skew.

§  2.  Multivectors.  Outer  product 

1. Let us consider a set $\mathcal{G}$ whose elements are all contravariant skew tensors of all possible orders (including the first order) specified over an $n$-dimensional linear space $L$.

Definition. The outer (or alternate) product of the skew tensors $x$, $y$, where $x \in T_0^k$, $y \in T_0^l$, is a tensor of the space $T_0^{k+l}$. This is denoted by the symbol $x \wedge y$ and is expressed by the formula

$$x \wedge y = \frac{(k + l)!}{k!\,l!}\,[xy] \tag{1}$$




The  square  brackets  denote  the  alternation  of  an  ordinary  product 
of  tensors  xy. 

In  the  particular  case  of  two  arbitrary  contravariant  vectors  x, 
y  e  L  we  have 

$$x \wedge y = xy - yx$$

2. An alternation always yields a skew tensor and therefore outer multiplication does not take us outside the set $\mathcal{G}$. It is easy to verify that the collection of skew tensors of a given order $m$ forms a subspace in $T_0^m$. We denote it by $G^m$. Hence the set $\mathcal{G}$ is closed under addition of tensors and their multiplication by scalars.

Nonzero tensors in $\mathcal{G}$ are considered equal elements of this set if and only if they belong to a single space $T_0^m$ and are equal as elements of $T_0^m$. Besides, $\mathcal{G}$ includes the zero elements of the spaces $T_0^m$ for all natural $m$. Zero tensors of all orders are considered to be equal elements of the set $\mathcal{G}$. We will denote them by the symbol 0. Clearly,

$$x \wedge 0 = 0$$

for any $x \in \mathcal{G}$. By Subsection 6, Section 1, we have $x \wedge y = 0$ if $x \in T_0^k$, $y \in T_0^l$, $k + l > n$.

3. We will use the term Grassmann algebra * over the space $L$ to denote the set $\mathcal{G}$ with the aforementioned equality of elements and the operations defined in this set: outer multiplication, multiplication by a scalar, and also addition of tensors of the same order.

Contravariant skew tensors of order $k$, viewed as elements of the Grassmann algebra, are called contravariant $k$-vectors or contravariant multivectors. The number $k$ is called the order of the multivector. The element 0 is called the zero multivector. To the order of the zero multivector we can assign any natural value. All multivectors


* This definition of a Grassmann algebra is inadequate. The point is that in sets called algebras the operation of addition is defined for all elements. Therefore we should also have defined addition of tensors of different orders in the set $\mathcal{G}$. This is done by constructing symbolic sums of elements of different spaces $T_0^m$. Besides, one ordinarily includes in $\mathcal{G}$ tensors of order zero, that is to say, scalars (invariants). As a result, $\mathcal{G}$ becomes a linear space of dimension $2^n$ isomorphic to a direct sum of the subspaces $G^m$:

$$\mathcal{G} = G^0 \oplus G^1 \oplus \ldots \oplus G^n$$

where $G^0$ denotes the collection of tensors of order zero. We do not carry out this construction in detail since we will not be dealing with the addition of tensors of different orders. Also observe that the set $\mathcal{G}$ is often called a Grassmann algebra over the space $L^*$, which is conjugate to the given space $L$. This is because the elements of $\mathcal{G}$ may be identified with multilinear forms whose arguments are vectors of $L^*$. A general definition of Grassmann algebra is given in [2].



of order $k > n$ are equal to the zero multivector. Thus, all nonzero multivectors belong to the spaces $T_0^1, T_0^2, \ldots, T_0^n$.

4.  Elementary  properties  of  outer  multiplication. 

(1) $(\alpha x) \wedge y = x \wedge (\alpha y) = \alpha (x \wedge y)$ for any scalar $\alpha$ and any $x, y \in \mathcal{G}$, since a numerical factor can be taken outside the sign of ordinary multiplication of tensors and also outside the sign of alternation.

(2) $(x + y) \wedge z = x \wedge z + y \wedge z$, since both ordinary multiplication and alternation of a product of tensors are distributive over addition.

(3) Outer multiplication is skew-commutative, namely,

$$x \wedge y = (-1)^{kl}\, y \wedge x \tag{2}$$

if $x \in T_0^k$, $y \in T_0^l$.

Proof.  In  coordinate  (component)  notation  we  have 


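$$x = \sum x^{i_1 \ldots i_k} e_{i_1} \ldots e_{i_k}, \qquad y = \sum y^{j_1 \ldots j_l} e_{j_1} \ldots e_{j_l}$$

whence

$$x \wedge y = \frac{(k + l)!}{k!\,l!} \sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} [e_{i_1} \ldots e_{i_k} e_{j_1} \ldots e_{j_l}] \tag{3}$$

Similarly

$$y \wedge x = \frac{(k + l)!}{k!\,l!} \sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} [e_{j_1} \ldots e_{j_l} e_{i_1} \ldots e_{i_k}] \tag{4}$$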

Here, $e_1, \ldots, e_n$ is a basis in $L$, the indices $i_1, \ldots, i_k, j_1, \ldots, j_l$ run from 1 to $n$ independently, the summation is over all indices, and the square brackets denote alternation, as above. To carry the permutation of indices $(i_1, \ldots, i_k, j_1, \ldots, j_l)$ into the permutation $(j_1, \ldots, j_l, i_1, \ldots, i_k)$ requires $kl$ interchanges of adjacent indices. Therefore, and by virtue of the definition of an alternation, each term in the sum (4) differs from the corresponding one in (3) by the factor $(-1)^{kl}$, whence follows (2).
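The count of $kl$ adjacent interchanges used in this proof can be confirmed by brute force. The sketch below is only an illustration: the function names are hypothetical, and the positions of $(i_1, \ldots, i_k, j_1, \ldots, j_l)$ are simply labelled $0, \ldots, k + l - 1$.

```python
def adjacent_swaps(seq):
    """Count the adjacent interchanges bubble sort uses to sort seq increasingly."""
    seq = list(seq)
    swaps = 0
    for end in range(len(seq) - 1, 0, -1):
        for pos in range(end):
            if seq[pos] > seq[pos + 1]:
                seq[pos], seq[pos + 1] = seq[pos + 1], seq[pos]
                swaps += 1
    return swaps

def block_swap_count(k, l):
    """Adjacent interchanges relating (i_1..i_k, j_1..j_l) to (j_1..j_l, i_1..i_k)."""
    # Labels 0..k-1 stand for the i-indices, labels k..k+l-1 for the j-indices.
    rearranged = list(range(k, k + l)) + list(range(k))
    return adjacent_swaps(rearranged)

# The sign relating (3) and (4) for k = 2, l = 3.
sign = (-1) ** block_swap_count(2, 3)
```

Bubble sort performs exactly as many adjacent interchanges as the permutation has inversions, and here every $i$-index is inverted with every $j$-index, which gives $kl$.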

Corollary. If $x$ is a multivector of odd order, then $x \wedge x = 0$.

(4) An outer product is associative, that is, for any multivectors $x, y, z \in \mathcal{G}$ we have

$$(x \wedge y) \wedge z = x \wedge (y \wedge z)$$

To  prove  this  identity  we  will  need  some  preliminary  results. 

5. Consider two arbitrary sets of basis vectors: $e_{i_1}, \ldots, e_{i_k}$ and $e_{j_1}, \ldots, e_{j_l}$. Take their alternations to get two multivectors: $[e_{i_1} \ldots e_{i_k}]$ and $[e_{j_1} \ldots e_{j_l}]$.




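Multiply them by the ordinary rule for multiplying tensors and take the alternation of the resulting product:

$$[[e_{i_1} \ldots e_{i_k}][e_{j_1} \ldots e_{j_l}]] = \frac{1}{k!\,l!} \sum \delta^{\alpha_1 \ldots \alpha_k}_{i_1 \ldots i_k}\, \delta^{\beta_1 \ldots \beta_l}_{j_1 \ldots j_l}\, [e_{\alpha_1} \ldots e_{\alpha_k} e_{\beta_1} \ldots e_{\beta_l}] \tag{5}$$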

To start with, suppose that there are no identical indices among $i_1, \ldots, i_k$ and among $j_1, \ldots, j_l$. Then it will suffice, in the sum (5), to consider only those terms where $\alpha_1 \ldots \alpha_k$ is a permutation of the set of indices $i_1, \ldots, i_k$, and $\beta_1 \ldots \beta_l$ is a permutation of $j_1, \ldots, j_l$ (the remaining terms are equal to zero). But it is obvious that all such terms are identical and for any one of them we have

$$\delta^{\alpha_1 \ldots \alpha_k}_{i_1 \ldots i_k}\, \delta^{\beta_1 \ldots \beta_l}_{j_1 \ldots j_l}\, [e_{\alpha_1} \ldots e_{\alpha_k} e_{\beta_1} \ldots e_{\beta_l}] = \delta^{i_1 \ldots i_k}_{i_1 \ldots i_k}\, \delta^{j_1 \ldots j_l}_{j_1 \ldots j_l}\, [e_{i_1} \ldots e_{i_k} e_{j_1} \ldots e_{j_l}] = [e_{i_1} \ldots e_{i_k} e_{j_1} \ldots e_{j_l}]$$

It is also obvious that the total number of such terms is equal to $k!\,l!$. Consequently

$$[[e_{i_1} \ldots e_{i_k}][e_{j_1} \ldots e_{j_l}]] = [e_{i_1} \ldots e_{i_k} e_{j_1} \ldots e_{j_l}] \tag{6}$$

It is now clear that (6) holds true at all times because if there are identical indices among $i_1, \ldots, i_k$ or $j_1, \ldots, j_l$, then both members are equal to the zero multivector.


6. From (6) follows immediately the associative property for an outer product of basis vectors. Namely, we have

$$e_i \wedge e_j = 2!\,[e_i e_j]$$

Furthermore

$$(e_i \wedge e_j) \wedge e_k = \frac{3!}{2!\,1!}\,[(e_i \wedge e_j)\, e_k] = 3!\,[[e_i e_j]\, e_k] = 3!\,[e_i e_j e_k]$$

Similarly

$$e_i \wedge (e_j \wedge e_k) = 3!\,[e_i\, [e_j e_k]] = 3!\,[e_i e_j e_k]$$

Thus, $(e_i \wedge e_j) \wedge e_k = e_i \wedge (e_j \wedge e_k)$ and $e_i \wedge e_j \wedge e_k = (e_i \wedge e_j) \wedge e_k = e_i \wedge (e_j \wedge e_k)$ is defined. From this and by induction (using (6)), we get

$$e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_m} = m!\,[e_{i_1} e_{i_2} \ldots e_{i_m}] \tag{7}$$

The outer product of many basis vectors (on the left) may be determined, as is usual in such cases, via any successive combination


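of factors. From (1), (6) and (7) it also follows that

$$(e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k}) \wedge (e_{j_1} \wedge e_{j_2} \wedge \ldots \wedge e_{j_l}) = e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k} \wedge e_{j_1} \wedge e_{j_2} \wedge \ldots \wedge e_{j_l} \tag{8}$$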

7. Let $x$ be an arbitrary multivector in $T_0^k$:

$$x = \sum x^{i_1 \ldots i_k} e_{i_1} \ldots e_{i_k} \tag{9}$$

By the definition of a multivector we have $x = [x]$ and, consequently, we can write

$$x = \sum x^{i_1 \ldots i_k} [e_{i_1} \ldots e_{i_k}]$$

whence, due to (7),

$$x = \frac{1}{k!} \sum x^{i_1 \ldots i_k}\, e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k} \tag{10}$$

The summation here is over all indices, each one running independently from 1 to $n$. Terms of this sum where there is at least one pair of identical indices are equal to zero. We now consider all terms corresponding to all possible permutations of some single set of distinct indices. There are $k!$ such terms for a given set of indices and they are all equal. And so in place of (10) we can write

$$x = {}^*\!\sum x^{i_1 \ldots i_k}\, e_{i_1} \wedge \ldots \wedge e_{i_k} \tag{11}$$

where the starred sigma is the summation sign over all sets of indices $i_1, \ldots, i_k$ provided $i_1 < i_2 < \ldots < i_k$.

8. As we know, all multivectors of a given order $k$ form in $T_0^k$ a subspace denoted by $G^k$. From (11) follows an important result: the outer products $e_{i_1} \wedge \ldots \wedge e_{i_k}$ of the basis vectors of space $L$ which correspond to all possible sets of indices $i_1 < i_2 < \ldots < i_k$ constitute a basis in $G^k$.

Indeed, by (11) every multivector $x \in G^k$ can be expanded in terms of the outer products $e_{i_1} \wedge \ldots \wedge e_{i_k}$ ($i_1 < \ldots < i_k$). On the other hand, it is easy to see that these outer products are linearly independent in $G^k$. Suppose there are numbers $x^{i_1 \ldots i_k}$, where $i_1 < i_2 < \ldots < i_k$, for which (11) yields $x = 0$. We determine $x^{i_1 \ldots i_k}$ for any arrangement of the indices $i_1, \ldots, i_k$ by the condition of skew symmetry; in other words, we put $x^{2 1 3 \ldots k} = -x^{1 2 3 \ldots k}$ and so on. Then in place of (11) we can write the equivalent expansion (9), where the summation is over all indices and the coefficients $x^{i_1 \ldots i_k}$ are the components of the tensor $x$. We have $x = 0$; hence $x^{i_1 \ldots i_k} = 0$ as components of the zero tensor. This proves the linear independence of the outer products

$$e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k} \qquad (i_1 < i_2 < \ldots < i_k)$$

9. Corollary. The dimension of the subspace $G^k$ is equal to $C_n^k$. Indeed, the number of all outer products $e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k}$, $i_1 < i_2 < \ldots < i_k$, is equal to the number of combinations of $n$ elements taken $k$ at a time, which is to say, $C_n^k$.
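As an illustration of this count, the index sets labelling the basis of $G^k$ can be enumerated directly. This is a minimal sketch; the function name `basis_labels` is a hypothetical helper, and the indices are taken to run from 1 to $n$ as in the text.

```python
from itertools import combinations
from math import comb

def basis_labels(n, k):
    """All index sets i_1 < ... < i_k, one per basis element of G^k."""
    return list(combinations(range(1, n + 1), k))

n = 4
labels = basis_labels(n, 2)   # bivector basis for n = 4
# Dimensions of G^0, G^1, ..., G^n; their sum is the dimension 2^n
# mentioned in the footnote on the Grassmann algebra.
dims = [len(basis_labels(n, k)) for k in range(n + 1)]
```

For $n = 4$, $k = 2$ the labels are (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), six in all, in agreement with $C_4^2 = 6$.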

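10. Let us now return to the problem of the associative property of an outer product of any multivectors. The proof of this property follows from identity (8). To simplify the proof, we introduce the notation

$$e_{i_1 \ldots i_k} = e_{i_1} \wedge e_{i_2} \wedge \ldots \wedge e_{i_k}$$

Then (8) becomes

$$e_{i_1 \ldots i_k} \wedge e_{j_1 \ldots j_l} = e_{i_1 \ldots i_k j_1 \ldots j_l} \tag{12}$$

Consider arbitrary multivectors $x \in T_0^k$, $y \in T_0^l$. We can write them as

$$x = {}^*\!\sum x^{i_1 \ldots i_k} e_{i_1 \ldots i_k}, \qquad y = {}^*\!\sum y^{j_1 \ldots j_l} e_{j_1 \ldots j_l}$$

From this and also from (12) we have

$$x \wedge y = {}^*\!\sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} e_{i_1 \ldots i_k j_1 \ldots j_l} \tag{13}$$

Here, the starred sigma is the sign of summation over all sets of indices $i_1, \ldots, i_k$ and $j_1, \ldots, j_l$ provided $i_1 < i_2 < \ldots < i_k$ and $j_1 < j_2 < \ldots < j_l$. However, in the general set $i_1, \ldots, i_k, j_1, \ldots, j_l$ these indices may not be arranged in increasing order.

Let $z \in T_0^m$ be another multivector:

$$z = {}^*\!\sum z^{s_1 \ldots s_m} e_{s_1 \ldots s_m} \tag{14}$$

Again using (12), from (13) and (14) we find

$$(x \wedge y) \wedge z = {}^*\!\sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} z^{s_1 \ldots s_m}\, e_{i_1 \ldots i_k j_1 \ldots j_l} \wedge e_{s_1 \ldots s_m} = {}^*\!\sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} z^{s_1 \ldots s_m}\, e_{i_1 \ldots i_k j_1 \ldots j_l s_1 \ldots s_m} \tag{15}$$

On the other hand,

$$x \wedge (y \wedge z) = {}^*\!\sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} z^{s_1 \ldots s_m}\, e_{i_1 \ldots i_k} \wedge e_{j_1 \ldots j_l s_1 \ldots s_m} = {}^*\!\sum x^{i_1 \ldots i_k} y^{j_1 \ldots j_l} z^{s_1 \ldots s_m}\, e_{i_1 \ldots i_k j_1 \ldots j_l s_1 \ldots s_m} \tag{16}$$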



The symbol ${}^*\!\sum$ in (15) and (16) is used in the same sense as in (13). Comparing (15) and (16), we obtain $(x \wedge y) \wedge z = x \wedge (y \wedge z)$, and the associative property is proved.
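Formula (13) translates directly into a computation on components. The sketch below assumes a multivector is stored as a dictionary mapping each increasing index tuple $(i_1, \ldots, i_k)$ to its coefficient in the starred sum (11); this representation and the names `sort_with_sign`, `wedge`, `scale` are illustrative assumptions, not notation from the text.

```python
def sort_with_sign(indices):
    """Sort by adjacent interchanges; return (sign, sorted tuple), sign 0 on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()          # e_{...} with a repeated index is the zero multivector
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for pos in range(end):
            if idx[pos] > idx[pos + 1]:
                idx[pos], idx[pos + 1] = idx[pos + 1], idx[pos]
                sign = -sign  # each adjacent interchange flips the sign
    return sign, tuple(idx)

def wedge(x, y):
    """Outer product by (13): e_I ^ e_J = e_{IJ}, then reorder IJ increasingly."""
    out = {}
    for I, a in x.items():
        for J, b in y.items():
            sign, K = sort_with_sign(I + J)
            if sign:
                out[K] = out.get(K, 0) + sign * a * b
    return {K: c for K, c in out.items() if c != 0}

def scale(c, x):
    """Multiply a multivector by the scalar c."""
    return {K: c * v for K, v in x.items()}

x = {(1,): 2, (2,): -1, (4,): 3}   # a vector (order 1), n = 4
y = {(1, 3): 1, (2, 4): 5}         # a bivector (order 2)
z = {(2,): 1, (3,): -2}            # a vector (order 1)

left = wedge(wedge(x, y), z)
right = wedge(x, wedge(y, z))
```

With these sample values `left` and `right` coincide, as (15) and (16) require; the same functions can be used to spot-check the skew-commutativity rule (2).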

11. In the usual manner we now define the outer product of any number of any contravariant multivectors: $x \wedge y \wedge z \wedge \ldots \wedge w$. For example,

$$x \wedge y \wedge z \wedge t = ((x \wedge y) \wedge z) \wedge t = (x \wedge y) \wedge (z \wedge t) = x \wedge (y \wedge (z \wedge t))$$

§  3.  Bivectors 

1. A bivector of a space $L$ is a multivector of order two. As before, we deal in contravariant multivectors. Therefore, when speaking of bivectors, we have in view multivectors in $T_0^2$.

2. A bivector $p$ is called a simple bivector if it is equal to the outer product of two vectors:

$$p = a_1 \wedge a_2 \tag{1}$$

where $a_1, a_2 \in L$. If $a_1, a_2$ are linearly dependent, then $p = 0$. If $a_1, a_2$ are independent, then $p \neq 0$. Indeed, if $a_1, a_2$ are independent, they can be completed to form a basis $a_1, a_2, a_3, \ldots, a_n$. But then the outer products $a_i \wedge a_j$, $i < j$, constitute a basis in the subspace of bivectors over $L$ (see Subsection 8, Section 2). Consequently, none of these outer products, including $a_1 \wedge a_2$, can be a zero bivector.

Thus, the simple bivector (1) is zero if and only if the vectors $a_1, a_2$ are linearly dependent.

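3. Suppose $p \neq 0$ and, hence, $a_1, a_2$ are independent. Then $a_1, a_2$ define in $L$ a two-dimensional subspace $L_2$ which is their linear hull: $L_2 = L(a_1, a_2)$. Let $b_1, b_2$ be any two vectors in $L_2$. Due to the independence of $a_1$ and $a_2$, we have

$$b_1 = \alpha_{11} a_1 + \alpha_{12} a_2, \qquad b_2 = \alpha_{21} a_1 + \alpha_{22} a_2 \tag{2}$$

where $\alpha_{ij}$ are numerical coefficients. Consider the bivector

$$q = b_1 \wedge b_2 \tag{3}$$

From (2) and (3) we get, by the rules of outer multiplication,

$$q = b_1 \wedge b_2 = (\alpha_{11} a_1 + \alpha_{12} a_2) \wedge (\alpha_{21} a_1 + \alpha_{22} a_2) = \alpha_{11}\alpha_{22}\,(a_1 \wedge a_2) + \alpha_{12}\alpha_{21}\,(a_2 \wedge a_1) = (\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21})\, a_1 \wedge a_2$$

Thus

$$q = Dp, \qquad D = \begin{vmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{vmatrix} \tag{4}$$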




That is, the bivector $q$ is proportional to the bivector $p$ and the constant of proportionality is $D = \det \|\alpha_{ij}\|$.
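Relation (4) is easy to test on numbers. The sketch below assumes vectors of a four-dimensional space given by their components in some basis and stores a simple bivector by its components $b_1^i b_2^j - b_1^j b_2^i$ for $i < j$ (indices counted from 0 in the code); the helper names and sample values are hypothetical.

```python
def bivector(b1, b2):
    """Components b1^i b2^j - b1^j b2^i (i < j) of the simple bivector b1 ^ b2."""
    n = len(b1)
    return {(i, j): b1[i] * b2[j] - b1[j] * b2[i]
            for i in range(n) for j in range(i + 1, n)}

def combine(c1, v1, c2, v2):
    """Linear combination c1*v1 + c2*v2 of two vectors."""
    return [c1 * a + c2 * b for a, b in zip(v1, v2)]

a1 = [1, 0, 2, -1]
a2 = [0, 3, 1, 4]
alpha = [[2, 5],
         [-1, 3]]                      # coefficients alpha_ij of (2)

b1 = combine(alpha[0][0], a1, alpha[0][1], a2)
b2 = combine(alpha[1][0], a1, alpha[1][1], a2)

D = alpha[0][0] * alpha[1][1] - alpha[0][1] * alpha[1][0]
p = bivector(a1, a2)
q = bivector(b1, b2)
q_expected = {key: D * value for key, value in p.items()}
```

Here every component of `q` is $D$ times the corresponding component of `p`, which is the statement $q = Dp$.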

Conversely, suppose that $q = b_1 \wedge b_2 = \lambda p$, where $\lambda$ is a number ($\lambda \neq 0$). Then $b_1, b_2 \in L_2$, that is, we have equations (2), and $\lambda = D$ from (4).

Let us prove this assertion.

We consider three arbitrary vectors $c_1, c_2, c_3$. If they are dependent, then

$$c_1 \wedge c_2 \wedge c_3 = 0 \tag{$\alpha$}$$

(see Subsection 4 of Section 1 and (7) of Section 2). If $c_1, c_2, c_3$ are independent, then $c_1 \wedge c_2 \wedge c_3 \neq 0$. Indeed, the independent vectors $c_1, c_2, c_3$ may be completed to the basis $c_1, c_2, c_3, \ldots, c_n$. But then all the outer products of the type $c_{i_1} \wedge c_{i_2} \wedge c_{i_3}$, $i_1 < i_2 < i_3$, will constitute a basis in $G^3$. Hence they are all different from zero. Thus, equation ($\alpha$) is necessary and sufficient for the dependence of the vectors $c_1, c_2, c_3$. Now let $b_1 \wedge b_2 = \lambda (a_1 \wedge a_2)$, where $\lambda \neq 0$, $a_1 \wedge a_2 \neq 0$. Form the outer product of both members by $b_1$. On the left we get $b_1 \wedge b_1 \wedge b_2 = 0$, and so $b_1 \wedge a_1 \wedge a_2 = 0$. From this, by what has already been said about the arbitrary vectors $c_1, c_2, c_3$, we conclude that $b_1, a_1, a_2$ are dependent. Since $a_1, a_2$ are independent, this means $b_1 \in L_2 = L(a_1, a_2)$. Similarly $b_2 \in L_2 = L(a_1, a_2)$. It is then clear that $\lambda = D$.

4. To summarize, if $a_1 \wedge a_2 \neq 0$, then

$$b_1 \wedge b_2 = \lambda (a_1 \wedge a_2), \qquad \lambda \neq 0 \tag{5}$$

if and only if $b_1, b_2$ are independent and $b_1, b_2 \in L_2 = L(a_1, a_2)$. Also, $\lambda = \det \|\alpha_{ij}\|$, where $\|\alpha_{ij}\|$ is the matrix composed of the components of the vectors $b_1, b_2$ relative to the basis $a_1, a_2$.

5. In particular

$$b_1 \wedge b_2 = a_1 \wedge a_2$$

if and only if $b_1, b_2 \in L_2 = L(a_1, a_2)$ and

$$\det \|\alpha_{ij}\| = 1$$

6. The subspace $L_2 = L(a_1, a_2)$ is called the subspace of the bivector $a_1 \wedge a_2$. We say that the bivector $a_1 \wedge a_2$ lies in the subspace $L_2$. We also say that $a_1 \wedge a_2$ is the direction bivector of this subspace (just as an ordinary vector lying on a straight line is said to be the direction vector of that line).

7. Suppose a linear space $L$ is equipped with a Euclidean metric. For the sake of clarity, imagine the nonzero bivector $b_1 \wedge b_2$ in the form of an oriented parallelogram constructed on an ordered pair of vectors $b_1, b_2$ (Fig. 68). The area of this parallelogram will be called the area of the bivector $b_1 \wedge b_2$. A bivector with area unity is termed a unit bivector.

Now let $a_1 \wedge a_2$ be a unit bivector. Then $\det \|\alpha_{ij}\| = \pm\sigma$, where $\sigma$ is the area of the bivector $b_1 \wedge b_2$, and relation (5) becomes

$$b_1 \wedge b_2 = \pm\sigma\,(a_1 \wedge a_2) \tag{6}$$

The signs $\pm$ correspond to the cases where the bivector $b_1 \wedge b_2$ is oriented in $L_2$ like the bivector $a_1 \wedge a_2$ or has the opposite orientation.


If we assume that the subspace $L_2$ itself is oriented by an ordered pair of vectors $a_1, a_2$, then in place of (6) we can write

$$b_1 \wedge b_2 = S\,(a_1 \wedge a_2) \tag{7}$$

where $S = \det \|\alpha_{ij}\|$ is the oriented area of the bivector $b_1 \wedge b_2$.

8. Thus, simple bivectors lying in $L_2$ are depicted in the form of oriented parallelograms of the subspace $L_2$. By (6) or (7), parallelograms of equal area and identical orientation in $L_2$ depict the same bivector (Fig. 69).

9. Assuming $L$ to be a Euclidean space, take in $L$ an orthonormal basis $e_1, \ldots, e_n$. Consider arbitrary vectors $b_1, b_2 \in L$. We have

$$b_1 = x_1^1 e_1 + x_1^2 e_2 + \ldots + x_1^n e_n, \qquad b_2 = x_2^1 e_1 + x_2^2 e_2 + \ldots + x_2^n e_n$$

Choose a pair of distinct basis vectors $e_i, e_j$ and assume $i < j$. They define a two-dimensional coordinate plane, which we denote by $E_{ij}$ (to be more exact, we should say that $E_{ij}$ is a two-dimensional subspace, namely $L(e_i, e_j)$). We consider that the plane is oriented by the bivector $e_i \wedge e_j$, $i < j$. We use the term projection of the bivector $b_1 \wedge b_2$ on $E_{ij}$ for the bivector

$$(x_1^i e_i + x_1^j e_j) \wedge (x_2^i e_i + x_2^j e_j) = S^{ij}\, e_i \wedge e_j$$

where $S^{ij} = x_1^i x_2^j - x_1^j x_2^i$ is the oriented area of a parallelogram constructed in $E_{ij}$ on the vectors $x_1^i e_i + x_1^j e_j$ and $x_2^i e_i + x_2^j e_j$, that is, on the projections of $b_1$ and $b_2$. Also, by the rules of outer multiplication we find

$$b_1 \wedge b_2 = {}^*\!\sum S^{ij}\, e_i \wedge e_j \tag{8}$$

where, as before, the starred sigma signifies summation with the proviso that $i < j$.

From (8) we conclude that in an orthonormal basis the components of the simple bivector $b_1 \wedge b_2$ are the oriented areas $S^{ij}$ of its projections on the two-dimensional coordinate planes $E_{ij}$, $i < j$ (see Fig. 70, where $n = 3$).


10. In the three-dimensional case, the basis consists of three vectors $e_1, e_2, e_3$ and the sum (8) also has three terms:

$$b_1 \wedge b_2 = S^{12}(e_1 \wedge e_2) + S^{13}(e_1 \wedge e_3) + S^{23}(e_2 \wedge e_3)$$

Therefore, with each bivector $b_1 \wedge b_2$ of three-dimensional Euclidean space $L$ we can associate a vector $c$ of the same space $L$, putting

$$c = S^{23} e_1 - S^{13} e_2 + S^{12} e_3$$

The vector $c$ is determined by the bivector $b_1 \wedge b_2$ invariantly with respect to changes to other orthonormal bases with the same orientation as that of the basis $e_1, e_2, e_3$. It is easy to see that the vector $c$ is the vector product of $b_1$ by $b_2$ (Fig. 71):

$$c = [b_1 \times b_2]$$
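The correspondence between the components $S^{ij}$ and the vector product can be checked on a numerical example. This is a small sketch with hypothetical helper names; the vectors are given by their components in an orthonormal basis, and the keys of the dictionary follow the 1-based indices of the text.

```python
def oriented_areas(b1, b2):
    """S^{ij} = x_1^i x_2^j - x_1^j x_2^i for i < j, keyed by 1-based (i, j)."""
    n = len(b1)
    return {(i + 1, j + 1): b1[i] * b2[j] - b1[j] * b2[i]
            for i in range(n) for j in range(i + 1, n)}

def cross(u, v):
    """Ordinary vector product in an orthonormal right-handed basis."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

b1 = (1, 2, 3)
b2 = (4, 5, 6)

S = oriented_areas(b1, b2)
# c = S^{23} e_1 - S^{13} e_2 + S^{12} e_3
c = (S[(2, 3)], -S[(1, 3)], S[(1, 2)])
```

For these vectors $S^{12} = -3$, $S^{13} = -6$, $S^{23} = -3$, so $c = (-3, 6, -3)$, which agrees with `cross(b1, b2)`.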

11. Let us take a look at arbitrary bivectors (not necessarily simple) in $n$-dimensional linear space $L$. For the time being we do not assume a Euclidean metric in $L$. Let $e_1, \ldots, e_n$ be a basis in $L$. Then $e_1 \wedge e_2, \ldots, e_{n-1} \wedge e_n$ constitute a basis in the space of bivectors over $L$. For any bivector $u$ we have the expansion:

$$u = {}^*\!\sum u^{ij}\, e_i \wedge e_j = u^{12} e_1 \wedge e_2 + u^{13} e_1 \wedge e_3 + \ldots + u^{1n} e_1 \wedge e_n + u^{23} e_2 \wedge e_3 + \ldots + u^{2n} e_2 \wedge e_n + \ldots + u^{n-1,\,n} e_{n-1} \wedge e_n \tag{9}$$

Thus, every bivector decomposes into a sum of simple bivectors.


12. Due to the relations $e_i \wedge e_j = e_i e_j - e_j e_i$, we can replace (9) with an expansion of $u$ relative to a basis in $T_0^2$:

$$u = \sum u^{ij} e_i e_j = 0 \cdot e_1 e_1 + u^{12} e_1 e_2 + \ldots + u^{1n} e_1 e_n + u^{21} e_2 e_1 + 0 \cdot e_2 e_2 + \ldots + u^{2n} e_2 e_n + \ldots + u^{n1} e_n e_1 + \ldots + u^{n,\,n-1} e_n e_{n-1} + 0 \cdot e_n e_n \tag{10}$$




Here, for $i > j$ we have $u^{ij} = -u^{ji}$. The diagonal contains zero components (with coefficients $u^{ii} = 0$).

13. The rank of the matrix $\|u^{ij}\|$ composed of the coefficients of expansion (10), that is, of the components of the bivector $u$ relative to the basis $e_i e_j$, is called the rank of the bivector $u$. We denote the rank by $r$ ($r = \operatorname{rank} \|u^{ij}\|$).

14. The rank of the bivector $u$ is also the rank of the bilinear form $u(\xi, \eta) = \sum u^{ij} \xi_i \eta_j$ with covariant arguments $\xi = (\xi_1, \ldots, \xi_n)$, $\eta = (\eta_1, \ldots, \eta_n)$, $\xi, \eta \in L^*$. This form is a complete contraction and thus an invariant, and so its rank is invariant, whence follows the invariance of the rank of the bivector, that is, the rank is independent of any choice of basis $e_1, \ldots, e_n \in L$.

15. Let us consider a linear operator $U$, which associates with an arbitrary vector $\xi$ in $L^*$ its right-hand contraction with the bivector $u$:

$$x = U\xi = (u, \xi) \in L$$

Let us write down the transformation $U$ in terms of components:

$$x^i = \sum u^{ij} \xi_j \tag{11}$$

where $x^i$ are the components of vector $x$ relative to any basis $e_1, \ldots, e_n$ of space $L$, and $\xi_j$ are the components of $\xi$ relative to the reciprocal basis $e^1, \ldots, e^n$ of space $L^*$. The matrix of transformation $U$ coincides with the matrix $\|u^{ij}\|$ of the bivector $u$. Therefore the rank of $U$ (defined as the rank of its matrix) coincides with the rank of the bivector $u$ ($\operatorname{rank} U = r$).

Let $L_r = U(L^*)$ be the image of the entire space $L^*$. By Section 3, Chapter VII, $L_r$ is a subspace of dimension $r$ in the space $L$. The subspace $L_r$ will be called the rank subspace of the bivector $u$.

Suppose that the basis vectors $e_1, \ldots, e_r$ are chosen in $L_r$. Then they form a basis in $L_r$ and every vector $x = U\xi$ can be decomposed in terms of the vectors $e_1, \ldots, e_r$. In other words, in this case we have

$$x^{r+1} = \ldots = x^n = 0 \tag{12}$$

Equations (12) hold true for arbitrary $\xi \in L^*$, and so from (11) and (12) we have $u^{ij} = 0$ if $i > r$. From this, $u^{ij} = 0$ if $j > r$ since $u^{ij} = -u^{ji}$. And so if the basis $e_1, \ldots, e_n$ is such that $e_1, \ldots, e_r \in L_r$, then the matrix of the components of the bivector $u$ in terms of the basis in $T_0^2$ (that is, the matrix of coefficients of the expansion (10)) is of the form

$$\begin{pmatrix} A_r & 0 \\ 0 & 0 \end{pmatrix} \tag{$\beta$}$$

where $A_r$ denotes the $r \times r$ square submatrix, all other elements of the matrix ($\beta$) being zeros. Clearly $D_r = \det A_r \neq 0$, otherwise the rank of the bivector $u$ would be less than $r$.

16. Let $L_0^*$ be the zero subspace of the bilinear form $u(\xi, \eta)$ ($L_0^*$ is of dimension $n - r$). From Subsection 15 it follows that if $x \in L_r$, $\xi \in L_0^*$, then

$$(x, \xi) = 0 \tag{$\gamma$}$$

If ($\gamma$) holds for any $\xi \in L_0^*$, then $x \in L_r$; if ($\gamma$) holds for any $x \in L_r$, then $\xi \in L_0^*$. Thus, if we regard a contraction as the analogue of a scalar product, then $L_r$ and $L_0^*$ are similar to a subspace and its orthogonal complement. (Of course one has to bear in mind that $L_r$ and $L_0^*$ lie in distinct spaces.)

To see the truth of the foregoing, observe that $L_0^*$ is the null space of the transformation $U$ and that if the vectors $e_1, \ldots, e_r$ are chosen in $L_r$, then the vectors $e^{r+1}, \ldots, e^n$ of the reciprocal basis will lie in $L_0^*$ (see (11) with account taken of ($\beta$)) and will constitute a basis there.

17. Theorem 1. The rank of every bivector is an even number.

Proof. If $r$ were odd, we would have $D_r = 0$, since every skew-symmetric determinant of odd order is zero. (This is evident if we multiply by $(-1)$ every row of the skew-symmetric determinant and then take the transpose.) This contradicts the inequality $D_r \neq 0$ of Subsection 15.
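Theorem 1 can be observed on concrete matrices of components $\|u^{ij}\|$. The sketch below computes the rank by ordinary Gaussian elimination over exact fractions; this is a generic rank routine chosen for illustration, not the argument of the proof, and the sample matrices are assumptions.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0                                   # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Skew-symmetric component matrices u^{ij} = -u^{ji} for n = 3, 4, 5.
u3 = [[0, 1, 2], [-1, 0, 3], [-2, -3, 0]]
u4 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 2], [0, 0, -2, 0]]
u5 = [[0, 1, 0, 0, 0], [-1, 0, 2, 0, 0], [0, -2, 0, 3, 0],
      [0, 0, -3, 0, 4], [0, 0, 0, -4, 0]]

ranks = [rank(u3), rank(u4), rank(u5)]
```

In each case the rank comes out even: an odd-order skew-symmetric matrix such as `u3` or `u5` can never have full rank.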

18. Theorem 2. Every bivector in three-dimensional space is simple.

Proof. If $L$ is three-dimensional, then by the preceding theorem, for every bivector $u$ over $L$ two cases are possible: $r = 0$ and $r = 2$. In the former case, $u$ is a zero bivector and hence we can write $u = a \wedge a$, where $a$ is any vector in $L$. In the latter case, the rank subspace $L_r$ of $u$ is two-dimensional. Therefore, if in $L$ we take the basis $e_1, e_2, e_3$ provided $e_1, e_2 \in L_r$, then the expansion (10) becomes

$$u = u^{12}\, e_1 \wedge e_2$$

which proves the theorem.

19. Theorem 3. In a space of arbitrary dimension, every nonzero bivector $u$ may be represented as a sum of simple bivectors, whose number is equal to half the rank $r$ of $u$:

$$u = p_1 \wedge q_1 + \ldots + p_k \wedge q_k, \qquad 2k = r \tag{13}$$

The vectors $p_1, q_1, \ldots, p_k, q_k$ are linearly independent.

Proof. First of all note that if we have an expansion like (13), then $p_1, q_1, \ldots, p_k, q_k$ are definitely independent. Indeed, suppose these vectors are dependent. Then their linear hull $\tilde{L}$ has dimension $s < r$ and $p_1, q_1, \ldots, p_k, q_k$ can be expanded in terms of a basis $e_1, \ldots, e_s \in \tilde{L}$. Substituting their expansions into (13), we get for $u$ an expansion like (10) relative to the basis $e_i e_j$ in $T_0^2$ over $\tilde{L}$, that is, for $i, j = 1, 2, \ldots, s < r$. But in that case the rank of $u$ will prove to be less than $r$, which is a contradiction. Also note that by similar reasoning the number of simple bivectors in a sum of type (13) cannot be less than $r/2$.

We prove the possibility of expansion (13) by induction. It is clear that for every bivector $u$ of rank 2 there exists an expansion of type (13), namely, $u = p \wedge q$, where $p, q$ are independent vectors lying in a two-dimensional rank subspace. Suppose that the possibility of expansion (13) has been established for all bivectors of rank $2, 4, \ldots, r - 2$; then we will show that such an expansion is also possible for a bivector of rank $r$. This will complete the proof.

Let the basis $e_1, \ldots, e_n$ in $L$ be chosen with the proviso that $e_1, \ldots, e_r \in L_r$. We then have

$$u = u^{12} e_1 \wedge e_2 + u^{13} e_1 \wedge e_3 + \ldots + u^{1r} e_1 \wedge e_r + u^{23} e_2 \wedge e_3 + \ldots + u^{2r} e_2 \wedge e_r + \ldots + u^{r-1,\,r} e_{r-1} \wedge e_r$$

Put $e_1 = p_1$, $u^{12} e_2 + u^{13} e_3 + \ldots + u^{1r} e_r = q_1$. From this and from the foregoing expansion we find that the rank of the bivector $u - p_1 \wedge q_1$ does not exceed the number of vectors in the set $e_2, e_3, \ldots, e_r$, that is, it does not exceed $r - 1$. But the rank of every bivector is an even number. Consequently, the rank of the bivector $u - p_1 \wedge q_1$ does not exceed $r - 2$. For this reason and by the induction hypothesis there exist vectors $p_2, q_2, \ldots, p_k, q_k$, where $2k \leq r$ (that is, the number of pairs $p_i, q_i$, $i \geq 2$, does not exceed half the number $r - 2$), such that $u - p_1 \wedge q_1 = p_2 \wedge q_2 + \ldots + p_k \wedge q_k$. From this we obtain the expansion (13). Here $2k = r$ since in reality $2k < r$ is not possible due to the remark made at the beginning of the proof. The proof is complete.
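An expansion of type (13) can also be produced mechanically from the component matrix $\|u^{ij}\|$. The sketch below does not follow the proof literally: instead of adapting the basis to the rank subspace, it uses a pivoting variant of the same idea (pick a nonzero $u^{ab}$, split off one simple bivector that clears rows and columns $a$ and $b$, repeat). The function names and sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def decompose(u):
    """Return pairs (p, q) with u^{ij} = sum over pairs of p^i q^j - p^j q^i."""
    n = len(u)
    u = [[Fraction(v) for v in row] for row in u]
    pairs = []
    while True:
        pivot = next(((a, b) for a in range(n) for b in range(a + 1, n)
                      if u[a][b] != 0), None)
        if pivot is None:
            return pairs                      # remainder is the zero bivector
        a, b = pivot
        p = [u[i][b] / u[a][b] for i in range(n)]   # p^a = 1, p^b = 0
        q = [u[a][j] for j in range(n)]             # q^a = 0, q^b = u^{ab}
        pairs.append((p, q))
        # Subtracting p ^ q zeroes rows and columns a and b of the remainder.
        u = [[u[i][j] - (p[i] * q[j] - p[j] * q[i]) for j in range(n)]
             for i in range(n)]

def recombine(pairs, n):
    """Component matrix of the sum of the simple bivectors p ^ q."""
    return [[sum(p[i] * q[j] - p[j] * q[i] for p, q in pairs) for j in range(n)]
            for i in range(n)]

u4 = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]  # rank 4
u3 = [[0, 1, 2], [-1, 0, 3], [-2, -3, 0]]                            # rank 2

pairs4 = decompose(u4)
pairs3 = decompose(u3)
```

For the rank-4 example two simple bivectors are produced and for the rank-2 example one, in line with $2k = r$; `recombine` reproduces the original components in both cases.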

20. If a Euclidean metric is specified in an $n$-dimensional linear space $L$, we can assume that $L^*$ coincides with $L$. Here, by a contraction $(x, \xi)$ of two elements $x, \xi$ of $L$ we mean their scalar product. From this and on the basis of the reasoning of Subsection 16 we conclude that for a bivector in Euclidean space $L$, both the rank subspace $L_r$ and the zero subspace $L_0$ are defined in $L$ itself (we now understandably write $L_0$ without an asterisk). The subspaces $L_0$ and $L_r$ are orthogonal complements of each other.

21. Let $y = Ax$ be a linear transformation specified in the $n$-dimensional Euclidean space $L$. In component notation we have $y_i = A_{is} x^s$, where $x^s$ are contravariant components of the vector $x$ and $y_i$ are covariant components of the vector $y$; here, $A_{is}$ are covariant components of a tensor of the given linear transformation. We denote this tensor by $A$ and the transformation by $y = (A, x)$. The parentheses denote a right contraction of tensor $A$ with vector $x$. We call the linear transformation skew if $A_{is} = -A_{si}$ (see Chapter IX).

The definitions and theorems given above for contravariant bivectors carry over directly to covariant bivectors (skew second-order covariant tensors). In the case of a skew transformation, tensor $A$ is a covariant bivector, the rank of which is equal to the rank of that transformation. From this and by Subsection 19 we have the following proposition.

Let $y = Ax$ be a skew linear transformation in Euclidean space $L$; if its rank is $r$, then in $L$ there will be independent vectors $p_1, q_1, \ldots, p_k, q_k$, where $k = r/2$, such that the given transformation can be represented as

$$y = (p_1 \wedge q_1, x) + \ldots + (p_k \wedge q_k, x)$$

Here, $(p_i \wedge q_i, x)$ is the right contraction of the bivector $p_i \wedge q_i$ with the vector $x$:

$$(p_i \wedge q_i, x) = p_i\,(q_i, x) - q_i\,(p_i, x)$$

where $(q_i, x)$ and $(p_i, x)$ are scalar products.

22. In particular, if $L$ is three-dimensional Euclidean space and $y = Ax$ is an arbitrary skew transformation of nonzero rank specified in $L$, then there are independent vectors $p$, $q$ such that $y = (p \wedge q, x)$. Setting $a = -[p \times q]$, we get $y = [a \times x]$. Indeed, $[a \times x] = [x \times [p \times q]] = p(q, x) - q(p, x) = (p \wedge q, x)$. Thus, in three-dimensional Euclidean space, every skew transformation can be represented in the form of a vector product (including a transformation of rank zero when $a = 0$). This result has already been established in Chapter IX.
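
The identity of Subsection 22 is easy to check numerically. The following is a minimal sketch, assuming an orthonormal right-handed basis in three-dimensional Euclidean space; the helper names `cross`, `dot`, `contract_bivector` and the sample vectors are our own illustrative choices and do not come from the text.

```python
def cross(u, v):
    """Ordinary vector product [u x v] in an orthonormal right-handed basis."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product (u, v) in an orthonormal basis."""
    return sum(a * b for a, b in zip(u, v))


def contract_bivector(p, q, x):
    """Right contraction (p ^ q, x) = p (q, x) - q (p, x)."""
    qx, px = dot(q, x), dot(p, x)
    return tuple(pi * qx - qi * px for pi, qi in zip(p, q))


# Hypothetical sample vectors (p and q are independent).
p = (1, 2, 0)
q = (0, 1, 3)
x = (4, -1, 2)

a = tuple(-c for c in cross(p, q))   # a = -[p x q]
lhs = cross(a, x)                    # [a x x]
rhs = contract_bivector(p, q, x)     # (p ^ q, x)
```

Both `lhs` and `rhs` describe the same skew transformation applied to `x`, once as a vector product and once as a contraction with the bivector.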

23.  We  conclude  this  section  with  a  proposition  known  as  Car- 
tan’s  lemma. 



Given in a linear space $L$ two sets of vectors $p_1, \dots, p_k$ and $q_1, \dots, q_k$, the set $p_1, \dots, p_k$ being linearly independent. Let

$$p_1 \wedge q_1 + \dots + p_k \wedge q_k = 0 \qquad (14)$$

Then the vectors $q_1, \dots, q_k$ are expressed linearly in terms of $p_1, \dots, p_k$ by the relations

$$q_i = \sum_{s=1}^{k} \alpha_{is} p_s \qquad (15)$$

with the $k \times k$ symmetric matrix $\|\alpha_{is}\|$, that is, $\alpha_{is} = \alpha_{si}$. Conversely, if (15) hold and the matrix $\|\alpha_{is}\|$ is symmetric, then we also have (14).

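Proof. (1) We first demonstrate the converse. Given (15) and $\alpha_{is} = \alpha_{si}$. Then

$$\sum_{i=1}^{k} p_i \wedge q_i = \sum_{i,s=1}^{k} \alpha_{is}\, p_i \wedge p_s = {\sum_{i,s=1}^{k}}^{*} (\alpha_{is} - \alpha_{si})\, p_i \wedge p_s = 0$$

where the starred sigma denotes summation with the proviso that $i < s$. Thus (14) holds true.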

(2) Now let (14) be given. We complete the set of vectors $p_1, \dots, p_k$ to the basis $p_1, \dots, p_k, \dots, p_n$ in $L$. Then we write the expansion

$$q_i = \sum_{s=1}^{k} \alpha_{is} p_s + \alpha_{i,k+1} p_{k+1} + \dots + \alpha_{in} p_n \qquad (16)$$

From this and (14) we have

$$\sum_{i=1}^{k} p_i \wedge q_i = {\sum_{i,s=1}^{k}}^{*} (\alpha_{is} - \alpha_{si})\, p_i \wedge p_s + \sum_{i=1}^{k} (\alpha_{i,k+1}\, p_i \wedge p_{k+1} + \dots + \alpha_{in}\, p_i \wedge p_n) = 0 \qquad (17)$$

But the outer products $p_i \wedge p_s$, $i < s$, form a basis in the subspace of bivectors over $L$. For this reason and by (17), $\alpha_{is} = \alpha_{si}$ if $i, s = 1, 2, \dots, k$, and, besides, $\alpha_{is} = 0$ for $s > k$. Thus, relations (16) reduce to relations of type (15) and matrix $\|\alpha_{is}\|$ is symmetric. The proof is complete.
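
The converse part of Cartan's lemma can be illustrated numerically. The sketch below is only an illustration under our own assumptions: a bivector $p \wedge q$ is stored as the skew matrix of its components $p^i q^j - p^j q^i$, and the vectors and coefficient matrices are hypothetical sample data.

```python
def wedge2(p, q):
    """Components (p ^ q)^{ij} = p^i q^j - p^j q^i of a simple bivector."""
    n = len(p)
    return [[p[i] * q[j] - p[j] * q[i] for j in range(n)] for i in range(n)]


def add(A, B):
    """Entrywise sum of two square matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def cartan_sum(ps, alpha):
    """Sum of p_i ^ q_i, where q_i = sum_s alpha[i][s] * p_s as in (15)."""
    n = len(ps[0])
    total = [[0] * n for _ in range(n)]
    for i, p in enumerate(ps):
        q = [sum(alpha[i][s] * ps[s][c] for s in range(len(ps)))
             for c in range(n)]
        total = add(total, wedge2(p, q))
    return total


# Three linearly independent vectors in a four-dimensional space.
ps = [(1, 0, 2, 0), (0, 1, 1, 0), (0, 0, 1, 3)]
sym = [[2, 1, -1], [1, 0, 4], [-1, 4, 5]]       # symmetric coefficients
nonsym = [[2, 1, -1], [0, 0, 4], [-1, 4, 5]]    # alpha_12 != alpha_21

zero = [[0] * 4 for _ in range(4)]
sum_sym = cartan_sum(ps, sym)        # vanishes, as (14) requires
sum_nonsym = cartan_sum(ps, nonsym)  # does not vanish
```

With the symmetric matrix the sum (14) is the zero bivector; spoiling the symmetry in a single pair of entries leaves the nonzero term $(\alpha_{12} - \alpha_{21})\, p_1 \wedge p_2$.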

§  4.  Simple  multivectors 

1. A simple contravariant multivector in space $L$ is an outer product of several vectors taken in $L$:

$$p = a_1 \wedge a_2 \wedge \dots \wedge a_k$$

The number $k$ is called the order of the multivector $p$. We also say that $p$ is a $k$-vector in $L$.



2. We now present a number of propositions concerning simple multivectors of any order. They are a natural generalization of the propositions dealing with simple bivectors given in Subsections 2 to 9 of Section 3. The proofs of these propositions, which were presented in detail for the special case of $k = 2$, carry over in trivial fashion to the general case.

(1) $p = 0$ if and only if the vectors $a_1, \dots, a_k$ are linearly dependent.

(2) Let $a_1, \dots, a_k$ be independent; hence $p \neq 0$. Then $a_1, \dots, a_k$ determine a $k$-dimensional linear subspace $L_k$ in $L$, namely its linear hull: $L_k = L(a_1, \dots, a_k)$. We say that $L_k$ is a linear subspace of the multivector $p$ or that the multivector $p$ lies in $L_k$. We also say that $p$ is a direction multivector of the subspace $L_k$.

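Let us take in $L_k$ arbitrary vectors $b_1, \dots, b_k$. We have

$$\left.\begin{aligned} b_1 &= \alpha_{11} a_1 + \dots + \alpha_{1k} a_k, \\ &\;\;\vdots \\ b_k &= \alpha_{k1} a_1 + \dots + \alpha_{kk} a_k \end{aligned}\right\} \qquad (1)$$

where $\alpha_{ij}$ are numerical coefficients. Taking advantage of the distributive property of outer multiplication and its skew symmetry, we can readily prove that

$$q = b_1 \wedge b_2 \wedge \dots \wedge b_k = \det\|\alpha_{ij}\| \,(a_1 \wedge a_2 \wedge \dots \wedge a_k) \qquad (2)$$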

Indeed, in the termwise multiplication of the right members of (1) we only have to take into account those terms $\alpha_{1 i_1} \dots \alpha_{k i_k}\, a_{i_1} \wedge \dots \wedge a_{i_k}$, where the indices $i_1, \dots, i_k$ are all distinct (all other terms are zero). But for a written down (arbitrary) term we have

$$\alpha_{1 i_1} \dots \alpha_{k i_k}\, a_{i_1} \wedge \dots \wedge a_{i_k} = \delta_{i_1 \dots i_k}\, \alpha_{1 i_1} \dots \alpha_{k i_k}\, a_1 \wedge \dots \wedge a_k$$

Therefore

$$b_1 \wedge \dots \wedge b_k = \Bigl(\sum \delta_{i_1 \dots i_k}\, \alpha_{1 i_1} \dots \alpha_{k i_k}\Bigr)\, a_1 \wedge \dots \wedge a_k = \det\|\alpha_{ij}\| \,(a_1 \wedge \dots \wedge a_k)$$

Here we have used formula (1) from Section 3, Chapter II (strictly speaking, this formula yields the determinant of the transposed matrix $\|\alpha_{ij}\|$, since the summation here is over the second indices of the elements $\alpha_{ij}$ and not over the first indices, as in Section 3, Chapter II).

Let us once again stress that the factor $\det\|\alpha_{ij}\|$ is obtained due to the linearity of the product in each argument and to the skew symmetry of the product in each pair of arguments. In the sequel we will carry out similar computations without going into so much detail.




Thus, $q = Dp$, that is, the multivector $q$ is proportional to the multivector $p$ and the proportionality factor $D = \det\|\alpha_{ij}\|$.

Conversely, if $q = b_1 \wedge b_2 \wedge \dots \wedge b_k = \lambda p$, where $\lambda$ is a nonzero scalar, then $b_1, \dots, b_k$ are independent and lie in $L_k$; that is to say, equations (1) hold, and $\lambda = D = \det\|\alpha_{ij}\|$.

(3) In particular

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = a_1 \wedge a_2 \wedge \dots \wedge a_k$$

when and only when $b_1, \dots, b_k \in L_k = L(a_1, \dots, a_k)$ and $\det\|\alpha_{ij}\| = 1$.
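
Formula (2) lends itself to a direct numerical check. The following is a minimal sketch, assuming that a simple multivector is stored by all components of the skew tensor obtained from linearity and skew symmetry, that is, as a signed sum over permutations of the factors; the function names and the sample data are our own and purely illustrative.

```python
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation of 0..k-1, found by counting inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    total = 0
    for perm in permutations(range(len(m))):
        term = sign(perm)
        for r in range(len(m)):
            term *= m[r][perm[r]]
        total += term
    return total


def wedge(vectors):
    """All components of v_1 ^ ... ^ v_k, keyed by index tuples."""
    k, n = len(vectors), len(vectors[0])
    comps = {}
    for idx in product(range(n), repeat=k):
        total = 0
        for perm in permutations(range(k)):
            term = sign(perm)
            for r in range(k):
                term *= vectors[perm[r]][idx[r]]
            total += term
        comps[idx] = total
    return comps


# Hypothetical data: independent a_1, a_2, a_3 and coefficients alpha_ij.
a = [(1, 0, 2, 1), (0, 1, -1, 3), (2, 1, 0, 0)]
alpha = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
b = [tuple(sum(alpha[i][j] * a[j][c] for j in range(3)) for c in range(4))
     for i in range(3)]

D = det(alpha)   # proportionality factor of formula (2)
p = wedge(a)     # a_1 ^ a_2 ^ a_3
q = wedge(b)     # b_1 ^ b_2 ^ b_3
```

Every component of `q` equals `D` times the corresponding component of `p`, which is the content of formula (2).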


3. Suppose a linear space $L$ is equipped with a Euclidean metric. Let $b_1, \dots, b_k \in L_k = L(a_1, \dots, a_k)$. We imagine a nonzero multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ in the form of an oriented $k$-dimensional parallelepiped in $L_k$ constructed on an ordered set of vectors $b_1, b_2, \dots, b_k$. The volume ($k$-dimensional) of this parallelepiped will be called the volume of the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$. We will use the term unit multivector for one whose volume is equal to unity. The volume of an arbitrary simple multivector is computed in accordance with Subsection 6, Section 13, Chapter VIII.

Suppose the original multivector $a_1 \wedge a_2 \wedge \dots \wedge a_k$ is a unit multivector and that the subspace $L_k = L(a_1, \dots, a_k)$ is oriented by an ordered $k$-tuple of vectors $a_1, \dots, a_k$. Then $\det\|\alpha_{ij}\| = V$, where $V$ is the oriented $k$-dimensional volume of the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$. Equation (2) can now be written as

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = V \cdot a_1 \wedge a_2 \wedge \dots \wedge a_k \qquad (3)$$

4. Thus, simple multivectors of order $k$ lying in $L_k$ are depicted as oriented parallelepipeds of the subspace $L_k$. By (3), parallelepipeds of equal volume and the same orientation in $L_k$ represent one and the same multivector (for $k = 3$ see Fig. 72).

5. In $L$ take an orthonormal basis $e_1, \dots, e_n$. Let us consider the set of arbitrary vectors $b_1, \dots, b_k \in L$. We have

$$b_1 = \sum x_1^i e_i, \quad \dots, \quad b_k = \sum x_k^i e_i \qquad (4)$$




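We choose some basis vectors $e_{i_1}, \dots, e_{i_k}$, provided $i_1 < i_2 < \dots < i_k$. They determine a $k$-dimensional coordinate plane, which we denote by $E_{i_1 \dots i_k}$ (to be more precise, $E_{i_1 \dots i_k}$ is the linear hull of the vectors $e_{i_1}, \dots, e_{i_k}$). We will assume that the plane $E_{i_1 \dots i_k}$ is oriented by the multivector $e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$. Consider any vector $b_j$ of the $k$-tuple $b_1, \dots, b_k$. Denote by $\bar{b}_j$ the projection of $b_j$ on the plane $E_{i_1 \dots i_k}$:

$$\bar{b}_j = x_j^{i_1} e_{i_1} + x_j^{i_2} e_{i_2} + \dots + x_j^{i_k} e_{i_k}$$

We say that the multivector $\bar{b}_1 \wedge \bar{b}_2 \wedge \dots \wedge \bar{b}_k$ is the projection of the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ on the $k$-dimensional plane $E_{i_1 \dots i_k}$. By (3) we have

$$\bar{b}_1 \wedge \bar{b}_2 \wedge \dots \wedge \bar{b}_k = V^{i_1 i_2 \dots i_k}\, e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$$

Here, $V^{i_1 i_2 \dots i_k}$ is the oriented $k$-dimensional volume of a parallelepiped constructed in $E_{i_1 \dots i_k}$ on the vectors $\bar{b}_1, \dots, \bar{b}_k$. At the same time, for the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ itself, we find, via the rules of outer multiplication,

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = {\sum}^{*}\, V^{i_1 i_2 \dots i_k}\, e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k} \qquad (5)$$

where, as usual, the starred sigma denotes summation with the proviso that $i_1 < i_2 < \dots < i_k$.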

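From (5) we conclude that, relative to an orthonormal basis, the components of the simple multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ are

$$V^{i_1 i_2 \dots i_k} = \begin{vmatrix} x_1^{i_1} & x_1^{i_2} & \dots & x_1^{i_k} \\ x_2^{i_1} & x_2^{i_2} & \dots & x_2^{i_k} \\ \vdots & \vdots & & \vdots \\ x_k^{i_1} & x_k^{i_2} & \dots & x_k^{i_k} \end{vmatrix} \qquad (6)$$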


6. In Subsection 5 we discussed projections and volumes. However, the algebraic manipulations that resulted in formulas (5) and (6) do not depend on the metric. For this reason, if the vectors (4) are specified in a linear space $L$ relative to some basis $e_1, \dots, e_n$, then formula (5) with coefficients (6) holds true for the multivector $b_1 \wedge \dots \wedge b_k$.




§  5.  Vector  product 


1. Using simple multivectors, it is possible in a natural way to extend the notion of a vector product to the case of any number $k$ of vector factors in Euclidean space $E_n$ of any dimension $n$ ($n > k$).

First recall that in $E_3$ the components of a vector product $z = [x \times y]$ in an arbitrary basis are expressed as

$$z^i = \sum g^{i\gamma} \varepsilon_{\alpha\beta\gamma}\, x^{\alpha} y^{\beta} = \sum \varepsilon_{\alpha\beta}{}^{i}\, x^{\alpha} y^{\beta}$$

Here, $g^{ij}$ is the contravariant metric tensor, $\varepsilon_{\alpha\beta\gamma}$ is the covariant discriminant tensor, and $\varepsilon_{\alpha\beta}{}^{\gamma}$ are the mixed components of the discriminant tensor which are obtained by raising the last index (see Sections 9 and 13, Chapter VIII).

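Now let $x_1, \dots, x_k$ be $k < n$ vectors given in $E_n$ and let $\{x_j^i\}$ be the components of the vector $x_j$ in an arbitrary basis $e_1, \dots, e_n$ of $E_n$. We will construct a tensor $z$ via the following formula:

$$z^{i_1 \dots i_{n-k}} = \sum \varepsilon_{\alpha_1 \dots \alpha_k}{}^{i_1 \dots i_{n-k}}\, x_1^{\alpha_1} \dots x_k^{\alpha_k} = \sum g^{i_1 \beta_1} \dots g^{i_{n-k} \beta_{n-k}}\, \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, x_1^{\alpha_1} \dots x_k^{\alpha_k} \qquad (1)$$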


Here, as usual, $\varepsilon_{i_1 \dots i_n}$ is the discriminant tensor and $g^{ij}$ is the metric tensor of $E_n$. The discriminant tensor is axial and has skew symmetry with respect to all indices. Therefore $z$ is an axial multivector, that is, an axial skew-symmetric tensor (of order $n - k$). According to the general rules (see Section 9, Chapter VIII), we can write it in terms of covariant components. We obtain a simpler formula:

$$z_{i_1 \dots i_{n-k}} = \sum \varepsilon_{\alpha_1 \dots \alpha_k i_1 \dots i_{n-k}}\, x_1^{\alpha_1} \dots x_k^{\alpha_k} \qquad (2)$$

The multivector $z$ of order $n - k$ defined by (2) is called the vector product of the vectors $x_1, \dots, x_k$ taken in that order and we will denote it by $[x_1 \times \dots \times x_k]$. In other words,

$$[x_1 \times \dots \times x_k] = \{ z^{i_1 \dots i_{n-k}} \} \qquad (3)$$

We will now show that the multivector $z$ is a multidimensional analogue of the ordinary vector product.


2. The properties of a vector product (the proof is given in Subsection 3 below) are as follows.

(1) A vector product is linear in each factor. For example, for the first factor we have

$$[(\alpha x_1' + \beta x_1'') \times x_2 \times \dots \times x_k] = \alpha [x_1' \times x_2 \times \dots \times x_k] + \beta [x_1'' \times x_2 \times \dots \times x_k]$$



(2) A vector product is skew-symmetric in any pair of factors. For instance, for the first pair of factors we have

$$[x_1 \times x_2 \times \dots \times x_k] = -[x_2 \times x_1 \times \dots \times x_k]$$

(3) $[x_1 \times \dots \times x_k] = 0$ if and only if the factors are linearly dependent.

(4) A vector product is a simple multivector. To be more precise, there exist vectors $y_1, \dots, y_{n-k} \in E_n$ such that

$$[x_1 \times x_2 \times \dots \times x_k] = y_1 \wedge y_2 \wedge \dots \wedge y_{n-k}$$

(5) If the vectors $x_1, \dots, x_k$ are linearly independent, then the vectors $y_1, \dots, y_{n-k}$ are also linearly independent. Thus the subspaces $L_k = L(x_1, \dots, x_k)$ and $L_{n-k} = L(y_1, \dots, y_{n-k})$ are fully defined. These subspaces $L_k$ and $L_{n-k}$ are orthogonal complements of one another.

(6) If the vectors $x_1, \dots, x_k$ are linearly independent, then the orientations of $x_1, \dots, x_k$ in $L_k$ and $y_1, \dots, y_{n-k}$ in $L_{n-k}$ are such that the combined set of vectors $x_1, \dots, x_k, y_1, \dots, y_{n-k}$ has the same orientation as the basis $e_1, \dots, e_n$ (that is, as that basis in $E_n$ relative to which the components are specified in the original equations (1)).

(7) The $(n - k)$-dimensional volume of the multivector $z = y_1 \wedge y_2 \wedge \dots \wedge y_{n-k}$ is numerically equal to the $k$-dimensional volume of a parallelepiped constructed on the vectors $x_1, \dots, x_k$.

3. Proof. Properties (1) and (2) follow immediately from equation (2). Now let us prove property (3). From properties (1) and (2) it follows that $z = 0$ if $x_1, \dots, x_k$ are linearly dependent.

Now suppose $x_1, \dots, x_k$ are linearly independent. In the linear hull $L_k = L(x_1, \dots, x_k)$ we choose an orthonormal basis $a_1, \dots, a_k$ with orientation the same as that of the set of vectors $x_1, \dots, x_k$, and complete it to the orthonormal basis $a_1, \dots, a_k, \dots, a_n$ in the entire space $E_n$, which has the same orientation as the original basis $e_1, \dots, e_n$. We have an expansion of the form

$$x_i = \alpha_{i1} a_1 + \dots + \alpha_{ik} a_k$$

and $\det\|\alpha_{ij}\| = V$, where $V > 0$ is the $k$-dimensional volume of a parallelepiped constructed in $L_k$ on the vectors $x_1, \dots, x_k$. Due to the linearity and skew symmetry of a vector product, we get

$$[x_1 \times x_2 \times \dots \times x_k] = \det\|\alpha_{ij}\|\, [a_1 \times a_2 \times \dots \times a_k] = V\, [a_1 \times a_2 \times \dots \times a_k]$$

whence and also from (1)

$$z^{i_1 \dots i_{n-k}} = V \sum g^{i_1 \beta_1} \dots g^{i_{n-k} \beta_{n-k}}\, \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, a_1^{\alpha_1} \dots a_k^{\alpha_k} \qquad (4)$$



The tensor equation (4) holds true in any basis with orientation the same as that of $e_1, \dots, e_n$. In particular, in the orthonormal basis $a_1, \dots, a_n$ we have $g^{i\beta} = \delta^{i\beta}$, $a_j^{\alpha} = \delta_j^{\alpha}$, so that in this basis

$$z^{i_1 \dots i_{n-k}} = V \varepsilon_{1 \dots k\, i_1 \dots i_{n-k}}$$

whence it follows that the multivector $z$ has, for $i_1 < i_2 < \dots < i_{n-k}$, a unique nonzero component, namely,

$$z^{k+1 \dots n} = V \qquad (5)$$

From (5) it is evident that for linearly independent factors $[x_1 \times \dots \times x_k] \neq 0$. This proves property (3).

We now prove properties (4) to (7) of Subsection 2. Note that there cannot be more than one simple multivector with properties (5) to (7) (these properties specify a subspace and also the volume and orientation of a multivector). We construct a simple multivector $u$ equal to the vector product $z$ and possessing properties (5) to (7). Set

$$u = (V a_{k+1}) \wedge a_{k+2} \wedge \dots \wedge a_n$$

From (5), Section 4, we find that

$$u^{k+1 \dots n} = V \qquad (6)$$

and that the remaining components $u^{i_1 \dots i_{n-k}}$ are zero, provided $i_1 < i_2 < \dots < i_{n-k}$. Comparing (5) and (6) we get

$$u = z$$

Therefore, setting

$$y_1 = V a_{k+1}, \quad y_2 = a_{k+2}, \quad \dots, \quad y_{n-k} = a_n$$

we see that all the properties (4) to (7) of Subsection 2 hold true.
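
The properties just proved can be observed on a small computation. The sketch below assumes an orthonormal basis, so that the metric tensor is the unit matrix and the discriminant tensor reduces to the sign of a permutation; indices run from 0 in the code, and the function names and sample vectors are our own.

```python
from itertools import combinations, permutations


def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    total = 0
    for perm in permutations(range(len(m))):
        term = sign(perm)
        for r in range(len(m)):
            term *= m[r][perm[r]]
        total += term
    return total


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def vector_product(xs):
    """Components z^{j_1..j_{n-k}}, j_1 < ... < j_{n-k}, of [x_1 x ... x x_k]
    in an orthonormal basis: sum of eps_{a_1..a_k j_1..j_{n-k}} x_1^{a_1}..x_k^{a_k}."""
    k, n = len(xs), len(xs[0])
    z = {}
    for js in combinations(range(n), n - k):
        rest = [i for i in range(n) if i not in js]
        total = 0
        for alphas in permutations(rest):
            term = sign(alphas + js)
            for r in range(k):
                term *= xs[r][alphas[r]]
            total += term
        z[js] = total
    return z


# Three hypothetical vectors in E_4; their product is a multivector of order 1.
xs = [(1, 0, 0, 1), (0, 1, 0, 2), (0, 0, 1, 3)]
z = vector_product(xs)
zvec = tuple(z[(j,)] for j in range(4))
gram = [[dot(u, v) for v in xs] for u in xs]

# In E_3 with two factors the construction gives the ordinary vector product.
z3 = vector_product([(1, 2, 0), (0, 1, 3)])
```

Here `zvec` is orthogonal to each factor (property (5)), its squared length equals the Gram determinant of the factors, that is, the squared volume (property (7)), and the determinant of the rows `xs + [zvec]` is positive (property (6)).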

4. The notion of a vector product helps in an arbitrary (in particular, an oblique-angled) basis to solve the following geometric problems.

(a) Given in a Euclidean (vector) space $E_n$ a subspace $L_k$. Find its orthogonal complement $L_{n-k}$.

(b) Given in a Euclidean (point) space $E_n$ a point $A$ and a $k$-dimensional plane $P_k$ that does not pass through $A$. Pass through $A$ a plane of dimension $n - k$ orthogonal to $P_k$.

(c) Under the hypothesis of (b), drop from point $A$ a perpendicular on $P_k$ (that is, find a straight line orthogonal to $P_k$ passing through $A$ and intersecting $P_k$).

(d) Under the hypothesis of (b), find the shortest distance from point $A$ to plane $P_k$.



(e) Given in space $E_n$ the skew planes $P_k$ and $P_l$. Construct their common perpendicular.

(f) Under the hypothesis of (e), find the shortest distance between $P_k$ and $P_l$.

Without going into details, we consider the outlines of the solutions of the foregoing problems.

To solve (a) it suffices to take a basis $x_1, \dots, x_k$ in $L_k$ and construct the subspace $L_{n-k}$ of the multivector $[x_1 \times \dots \times x_k]$. The required computations are given below in Subsections 5 and 6.

Problem  (b)  clearly  reduces  to  (a). 


Let $P_{n-k}$ be a plane obtained in the solution of problem (b). It is easy to establish (say by Theorem 5, Section 7, Chapter III) that $P_{n-k}$ intersects $P_k$ and to prove that their point of intersection is unique. Denote that point by $B$. Then the straight line $AB$ will be the desired perpendicular in problem (c), and the length of the segment $AB$ will be the desired distance in problem (d).

The solution of problem (e) constitutes a multidimensional generalization of the construction of a common perpendicular to two skew lines in $E_3$.

Namely, let us consider the direction subspaces $L_k$ and $L_l$ of planes $P_k$ and $P_l$ respectively. Construct the sum $L' = L_k + L_l$ and its orthogonal complement $\tilde{L}$. Finding the latter reduces to solving (a). Let the intersection $L_k \cap L_l$ have dimension $m$. Then the dimension of $\tilde{L}$ is $p = n - (k + l - m)$.

Pass a plane $\bar{P}$ through an arbitrary point $M$ of plane $P_k$ in the direction of subspace $\bar{L} = L_k \oplus \tilde{L}$ (Fig. 73). $\bar{P}$ has dimension $k + p = n + m - l$. By virtue of the construction of $\tilde{L}$, the intersection $\bar{L} \cap L_l$ coincides with the intersection $L_k \cap L_l$ and therefore has dimension $m$. Using Theorem 5, Section 7, Chapter III, we find that the plane $\bar{P}$ intersects the plane $P_l$. Let $C$ be a point of their intersection. It can be shown that for $m \geq 1$ the point $C$ is not unique, in contrast to the familiar three-dimensional case.

Through point $C$ draw plane $\tilde{P}$ in the direction of the subspace $\tilde{L}$. Note that $\tilde{P}$ depends on the choice of $C$ in the plane $P_l \cap \bar{P}$ (see Fig. 74, where $n = 4$, $P_k = L_k = L(e_1, e_2)$, $P_l$ passes in the direction of $L(e_2, e_3)$, $\tilde{P}$ is one-dimensional). Regarding $P_k$ and $\tilde{P}$ as planes in a $(k + p)$-dimensional affine space $\bar{P}$ and once again applying Theorem 5 of Section 7, Chapter III, we find that $P_k$ and $\tilde{P}$ intersect in a point $D$ (Figs. 73, 74). The straight line $CD$ intersects $P_k$ and $P_l$. It is contained in $\tilde{P}$ so that its direction vector belongs to $\tilde{L}$ and therefore is orthogonal to all vectors in $L_k$ and $L_l$. Thus, the straight line $CD$ is the desired common perpendicular (generally not unique) to $P_k$ and $P_l$. We leave it to the reader to prove that the length of $CD$ is the shortest distance between $P_k$ and $P_l$, so that the solution of problem (f) is also obtained.

Note  that  the  finding  of  subspaces,  planes  and  points  mentioned 
above  reduces  to  the  solution  of  certain  systems  of  linear  equa¬ 
tions. 

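5. We return to problem (a). Suppose that $L_k$ is specified as the linear hull of independent vectors $x_1, \dots, x_k$ expanded in terms of a basis $e_1, \dots, e_n$:

$$x_j = \sum x_j^i e_i$$

By Subsections 1 and 2 we have

$$z = [x_1 \times \dots \times x_k] = y_1 \wedge \dots \wedge y_{n-k} = {\sum}^{*}\, z^{i_1 \dots i_{n-k}}\, e_{i_1} \wedge \dots \wedge e_{i_{n-k}}$$

where $z^{i_1 \dots i_{n-k}}$ are determined from (1) and the vectors $y_1, \dots, y_{n-k}$ constitute a basis in the desired subspace $L_{n-k}$. Write down the expansion of the vectors $y_1, \dots, y_{n-k}$ in terms of the basis $e_1, \dots, e_n$:

$$y_j = \sum y_j^i e_i$$

with the unknown coefficients $\{y_j^i\}$ and construct the matrix

$$Y = \left( \begin{array}{ccc|ccc} y_1^1 & \dots & y_1^{n-k} & y_1^{n-k+1} & \dots & y_1^n \\ \vdots & & \vdots & \vdots & & \vdots \\ y_{n-k}^1 & \dots & y_{n-k}^{n-k} & y_{n-k}^{n-k+1} & \dots & y_{n-k}^n \end{array} \right) \qquad (7)$$

As in Subsection 5, Section 4, we can prove that each of the components $z^{i_1 \dots i_{n-k}}$ of the multivector $z$ is equal to a determinant of order $n - k$ formed by the columns of matrix $Y$ with number labels $i_1, \dots, i_{n-k}$. Therefore, although we do not know the numerical values of the elements of $Y$, we do know the numerical values of all its minors of order $(n - k)$. For the sake of simplicity, suppose the left (indicated) minor of matrix (7) is a basis minor, that is, that

$$\Delta = z^{1 \dots n-k} \neq 0$$

The vector $u = \sum u^i e_i$ belongs to $L_{n-k}$ if and only if

$$\operatorname{rank} \begin{pmatrix} u^1 & \dots & u^{n-k} & \dots & u^n \\ y_1^1 & \dots & y_1^{n-k} & \dots & y_1^n \\ \vdots & & \vdots & & \vdots \\ y_{n-k}^1 & \dots & y_{n-k}^{n-k} & \dots & y_{n-k}^n \end{pmatrix} = n - k \qquad (8)$$

We now construct a minor of order $(n - k + 1)$ of matrix (8), the minor being formed by columns labelled $1, \dots, n - k, l$, where $n - k < l \leq n$. Such a minor is definitely equal to zero. Expand it by the first row to get a linear equation for the components of vector $u$; putting $l = n - k + 1, \dots, n$, we obtain the homogeneous system of equations

$$Au = 0 \qquad (9)$$

which has a $k \times n$ matrix of the form

$$A = \left( \begin{array}{ccc|cccc} a_{1,1} & \dots & a_{1,n-k} & (-1)^{n-k}\Delta & 0 & \dots & 0 \\ a_{2,1} & \dots & a_{2,n-k} & 0 & (-1)^{n-k}\Delta & \dots & 0 \\ \vdots & & \vdots & \vdots & & \ddots & \vdots \\ a_{k,1} & \dots & a_{k,n-k} & 0 & 0 & \dots & (-1)^{n-k}\Delta \end{array} \right)$$

Here the row that corresponds to the column label $l$ has the entries $(-1)^{s+1} z^{1 \dots \hat{s} \dots (n-k)\, l}$ in the columns $s = 1, \dots, n - k$ (the hat marks an omitted index), the entry $(-1)^{n-k}\Delta$ in column $l$, and zeros elsewhere.

An important point is that all nonzero elements of $A$ are certain components $z^{i_1 \dots i_{n-k}}$ of the multivector $z$. Any vector $u \in L_{n-k}$ satisfies the system (9). But $\operatorname{rank} A = k$ and so the vectors of $L_{n-k}$ form the entire set of solutions of the system (9). Thus, (9) determines the desired subspace and so yields a solution to problem (a).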
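
The passage from the components of $z$ to the system (9) can be traced on a small example. The following is a sketch under our own assumptions: the vectors $y_j$ are chosen in advance only in order to produce the minors of $Y$, the matrix $A$ is then assembled from those minors alone by expanding each bordered minor along its first row, and indices run from 0 in the code. The function names and the sample vectors are illustrative.

```python
from itertools import combinations, permutations


def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    total = 0
    for perm in permutations(range(len(m))):
        term = sign(perm)
        for r in range(len(m)):
            term *= m[r][perm[r]]
        total += term
    return total


def minors(Y):
    """z[cols] = determinant formed by the columns `cols` of Y, cols increasing."""
    m, n = len(Y), len(Y[0])
    return {cols: det([[row[c] for c in cols] for row in Y])
            for cols in combinations(range(n), m)}


def system_matrix(z, n, m):
    """Matrix A of system (9) built from the minors z only (m = n - k).

    Assumes the leading minor z[(0, ..., m-1)] is nonzero.
    """
    lead = tuple(range(m))
    delta = z[lead]
    A = []
    for l in range(m, n):
        row = [0] * n
        for s in range(m):
            cols = tuple(c for c in lead if c != s) + (l,)
            row[s] = (-1) ** s * z[cols]      # cofactor sign of first-row entry u^s
        row[l] = (-1) ** m * delta            # cofactor sign of first-row entry u^l
        A.append(row)
    return A


# Hypothetical basis y_1, y_2 of a two-dimensional subspace of a
# five-dimensional space (n = 5, n - k = 2, k = 3).
Y = [(1, 0, 2, 1, 0), (0, 1, 1, -1, 3)]
z = minors(Y)
A = system_matrix(z, n=5, m=2)
residuals = [[sum(a * c for a, c in zip(row, y)) for row in A] for y in Y]
```

Each row of `A` annihilates both `y` vectors, and the block with $\pm\Delta$ on the diagonal shows at once that the rank of `A` is $k$.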

Of  course  problems  (a)  to  (f)  can  be  solved  differently,  say  by 
using  Section  4  of  Chapter  VIII. 

6. Now let $e_1, \dots, e_n$ be an orthonormal basis in $E_n$. We then have the following equations:

$$[e_{i_1} \times e_{i_2} \times \dots \times e_{i_k}] = e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_{n-k}} \qquad (10)$$

where the combined collection of indices $(i_1, \dots, i_k, j_1, \dots, j_{n-k})$ is an even permutation of the natural sequence $(1, 2, \dots, n)$. To see the truth of equations (10), it suffices to note that the simple multivector $e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_{n-k}}$ possesses properties (4) to (7) of Subsection 2 relative to the vector product $[e_{i_1} \times e_{i_2} \times \dots \times e_{i_k}]$. But only one multivector can have these properties.

Formulas (10) provide a multiplication table of basis vectors $e_1, \dots, e_n$ and permit computing the vector product of any vectors via a termwise multiplication of their expansions in terms of the basis $e_1, \dots, e_n$. With their aid it is sometimes possible, without solving (9), to find the factors $y_1, \dots, y_{n-k}$ that enter directly into the expression $z = y_1 \wedge \dots \wedge y_{n-k}$. This also provides an alternative solution of problem (a).

Example. Given in a four-dimensional Euclidean space, in the orthonormal basis $e_1, e_2, e_3, e_4$, the vectors $x_1 = 2e_1 + e_2 + e_3$, $x_2 = e_1 + e_4$. Find the direction bivector of the orthogonal complement of the linear hull $L(x_1, x_2)$.

Solution. For the desired bivector we can take the vector product $[x_1 \times x_2]$. Multiplying the vectors $x_1$, $x_2$ termwise and using formulas (10), we find

$$\begin{aligned} [x_1 \times x_2] &= [(2e_1 + e_2 + e_3) \times (e_1 + e_4)] \\ &= [e_2 \times e_1] + [e_3 \times e_1] + 2[e_1 \times e_4] + [e_2 \times e_4] + [e_3 \times e_4] \\ &= e_4 \wedge e_3 + e_2 \wedge e_4 + 2 e_2 \wedge e_3 + e_3 \wedge e_1 + e_1 \wedge e_2 \\ &= e_1 \wedge e_2 - 2 e_3 \wedge e_2 - e_4 \wedge e_2 - e_1 \wedge e_3 + 2 e_3 \wedge e_3 + e_4 \wedge e_3 \\ &= (e_1 - 2e_3 - e_4) \wedge (e_2 - e_3) \end{aligned}$$

Thus for $L(x_1, x_2)$ the orthogonal complement is $L(y_1, y_2)$, where $y_1 = e_1 - 2e_3 - e_4$, $y_2 = e_2 - e_3$.
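
The termwise multiplication used in this example can be mechanized. The sketch below assumes an orthonormal basis and implements the multiplication table (10) directly; indices run from 0 in the code, and the function names are our own.

```python
from itertools import product


def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def basis_product(idx, n):
    """[e_{i_1} x ... x e_{i_k}] by rule (10).

    Returns (s, js): the product equals s * e_{j_1} ^ ... ^ e_{j_{n-k}}
    with j_1 < ... < j_{n-k}; s is 0 when an index is repeated.
    """
    if len(set(idx)) < len(idx):
        return 0, ()
    js = tuple(j for j in range(n) if j not in idx)
    return sign(tuple(idx) + js), js


def vector_product(xs):
    """Termwise multiplication of the expansions of x_1, ..., x_k."""
    k, n = len(xs), len(xs[0])
    z = {}
    for idx in product(range(n), repeat=k):
        coeff = 1
        for r in range(k):
            coeff *= xs[r][idx[r]]
        if coeff == 0:
            continue
        s, js = basis_product(idx, n)
        if s:
            z[js] = z.get(js, 0) + s * coeff
    return z


def wedge2(u, v):
    """Components of u ^ v for index pairs i < j."""
    n = len(u)
    return {(i, j): u[i] * v[j] - u[j] * v[i]
            for i in range(n) for j in range(i + 1, n)}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x1, x2 = (2, 1, 1, 0), (1, 0, 0, 1)     # vectors of the example
y1, y2 = (1, 0, -2, -1), (0, 1, -1, 0)  # factors found in the solution
z = vector_product([x1, x2])
w = wedge2(y1, y2)
```

The dictionary `z` reproduces the coefficients of the third line of the computation above, `w` agrees with it component by component, and `y1`, `y2` are orthogonal to `x1`, `x2`.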

§  6.  Outer  forms  and  operations  on  them 

1. Let $L$ be an $n$-dimensional linear space, $x_1, \dots, x_k$ a set of arbitrary vectors in $L$, $\omega(x_1, \dots, x_k)$ a multilinear form with vector arguments $x_1, \dots, x_k$.

We use the term alternation of the form $\omega(x_1, \dots, x_k)$ for a multilinear form denoted by $\langle \omega(x_1, \dots, x_k) \rangle$ and defined by

$$\langle \omega(x_1, \dots, x_k) \rangle = \frac{1}{k!} \sum \delta_{j_1 \dots j_k}\, \omega(x_{j_1}, x_{j_2}, \dots, x_{j_k}) \qquad (1)$$

where the summation over each of the indices $j_1, \dots, j_k$ runs from 1 to $k$. In particular, for a linear form $\omega(x)$ we have

$$\langle \omega(x) \rangle = \omega(x) \qquad (2)$$



For a bilinear form $\omega(x, y)$ we have

$$\langle \omega(x, y) \rangle = \frac{1}{2!} \bigl( \omega(x, y) - \omega(y, x) \bigr)$$

For a form of three arguments we have

$$\langle \omega(x, y, z) \rangle = \frac{1}{3!} \bigl( \omega(x, y, z) + \omega(y, z, x) + \omega(z, x, y) - \omega(y, x, z) - \omega(x, z, y) - \omega(z, y, x) \bigr)$$

From formula (1) it is clear that

(1) $\langle \omega \rangle$ is indeed a multilinear form, that is to say, it possesses linearity in each argument; for example, in the first argument:

$$\langle \omega(\alpha x_1' + \beta x_1'', x_2, \dots, x_k) \rangle = \alpha \langle \omega(x_1', x_2, \dots, x_k) \rangle + \beta \langle \omega(x_1'', x_2, \dots, x_k) \rangle$$

(2) $\langle \omega(x_1, \dots, x_k) \rangle$ has skew symmetry in any pair of arguments; for instance, in the first pair:

$$\langle \omega(x_1, x_2, x_3, \dots, x_k) \rangle = -\langle \omega(x_2, x_1, x_3, \dots, x_k) \rangle$$

2. The form $\omega$ will be termed skew if $\langle \omega \rangle = \omega$.

By equation (2) every linear form, that is, a form in one argument, must be skew.

From the definition of an alternation it follows that a skew form has skew symmetry in every pair of arguments.

Conversely, if a multilinear form $\omega$ has skew symmetry in any pair of its arguments, then it is skew in the sense of the foregoing definition, or $\langle \omega \rangle = \omega$. The proof is left to the reader.

3. Let $\omega(x_1, \dots, x_k)$ be a skew form. If the vectors $x_1, \dots, x_k$ are linearly dependent, then the corresponding value of the form $\omega$ is zero.

The assertion follows from the properties of linearity in each argument and skew symmetry in each pair of arguments.

4. If the number of arguments of a skew form $\omega$ exceeds the dimension of the space ($k > n$), then $\omega$ is identically zero.

This assertion is a consequence of Subsection 3.

5. The important thing to grasp is that the skew form $\omega(x_1, \dots, x_k)$ is a function of a simple multivector $p = x_1 \wedge \dots \wedge x_k$. In other words, the value of $\omega$ remains unaltered if the vectors $x_1, \dots, x_k$ are replaced by other vectors $y_1, \dots, y_k$, provided that

$$y_1 \wedge \dots \wedge y_k = x_1 \wedge \dots \wedge x_k$$




Let us now prove this. Due to Subsections 3, 4, it suffices to consider the case where $x_1, \dots, x_k$ are independent. Then the equation $y_1 \wedge \dots \wedge y_k = x_1 \wedge \dots \wedge x_k$ is equivalent to the fact that we have the expansions

$$\left.\begin{aligned} y_1 &= \alpha_{11} x_1 + \dots + \alpha_{1k} x_k, \\ &\;\;\vdots \\ y_k &= \alpha_{k1} x_1 + \dots + \alpha_{kk} x_k \end{aligned}\right\} \qquad (3)$$

where $\det\|\alpha_{ij}\| = 1$ (see Subsection 2, Section 4). From equations (3) and also from the properties of linearity and skew symmetry of the form $\omega$ we have

$$\omega(y_1, \dots, y_k) = \det\|\alpha_{ij}\|\, \omega(x_1, \dots, x_k) = \omega(x_1, \dots, x_k)$$

which proves our assertion.
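
A small numerical illustration of this assertion follows. It is only a sketch: the skew bilinear form `omega`, the vectors and the coefficient matrices are hypothetical sample data of our own choosing.

```python
def omega(u, v):
    """A hypothetical skew bilinear form in a three-dimensional space."""
    return (u[0] * v[1] - u[1] * v[0]) + 3 * (u[1] * v[2] - u[2] * v[1])


def combine(alpha, xs):
    """Vectors y_i = sum_j alpha[i][j] * x_j, as in the expansions (3)."""
    n = len(xs[0])
    return [tuple(sum(alpha[i][j] * xs[j][c] for j in range(len(xs)))
                  for c in range(n))
            for i in range(len(alpha))]


xs = [(1, 0, 2), (0, 1, 1)]

unimodular = [[2, 1], [3, 2]]   # determinant 1
scaled = [[2, 0], [0, 1]]       # determinant 2

ys = combine(unimodular, xs)    # same bivector: y_1 ^ y_2 = x_1 ^ x_2
ws = combine(scaled, xs)        # bivector multiplied by 2

base = omega(*xs)
same = omega(*ys)
doubled = omega(*ws)
```

The value of the form is unchanged when the determinant of the coefficients is 1, and is multiplied by the determinant otherwise.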

6. By Subsection 5, the domain of definition of a skew form $\omega(x_1, \dots, x_k) = \omega(p)$ is the set of all simple multivectors of order $k$.

Definition. A skew form as a function of a multivector is termed an outer form. The order $k$ of the argument $p$ is conventionally called the degree of the outer form $\omega(p)$.

We also say that $\omega(p)$ is a $k$-form, which is denoted frequently by $\omega^k(p)$.

7. Outer forms of a given degree $k$ constitute a linear subspace in the space of all multilinear forms with arguments $x_1, \dots, x_k$. This is so because outer forms of one and the same degree can be added and multiplied by a scalar to yield outer forms of the same degree (a linear combination of skew forms in $x_1, \dots, x_k$ is obviously a skew form in the same arguments).

8. We now define an outer product of two outer forms of arbitrary degree. The outer product of the form $\omega^k(p)$, $p = x_1 \wedge \dots \wedge x_k$, multiplied by the form $\omega^l(q)$, $q = x_{k+1} \wedge \dots \wedge x_{k+l}$, is an outer form of degree $k + l$ denoted by $\omega^k(p) \wedge \omega^l(q)$ and defined by

$$\omega^k(p) \wedge \omega^l(q) = \frac{(k+l)!}{k!\, l!} \langle \omega^k(x_1, \dots, x_k)\, \omega^l(x_{k+1}, \dots, x_{k+l}) \rangle$$

where $\langle \omega^k(x_1, \dots, x_k)\, \omega^l(x_{k+1}, \dots, x_{k+l}) \rangle$ is the alternation of a multilinear form of degree $k + l$ obtained by ordinary multiplication of $\omega^k$ by $\omega^l$.

We now prove that $\omega^k(p) \wedge \omega^l(q)$ is an outer form of degree $k + l$ of the multivector $p \wedge q$:

$$\omega^k(p) \wedge \omega^l(q) = \omega^{k+l}(p \wedge q)$$



Proof. The fact that $\omega^k(p) \wedge \omega^l(q)$ is a skew form of degree $k + l$ follows directly from the definition of an alternation. Then we have

$$\omega^k(p) \wedge \omega^l(q) = \omega^{k+l}(x_1, \dots, x_{k+l}) = \omega^{k+l}(x_1 \wedge \dots \wedge x_{k+l})$$

But by Subsections 10, 11, Section 2,

$$x_1 \wedge \dots \wedge x_k \wedge x_{k+1} \wedge \dots \wedge x_{k+l} = (x_1 \wedge \dots \wedge x_k) \wedge (x_{k+1} \wedge \dots \wedge x_{k+l}) = p \wedge q$$

which is what we sought.

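9. We now prove that an outer product of outer forms has the following properties.

(1) $(\alpha \omega^k) \wedge \omega^l = \omega^k \wedge (\alpha \omega^l) = \alpha (\omega^k \wedge \omega^l)$ for any scalar $\alpha$ and any outer forms $\omega^k$ and $\omega^l$.

(2) $(\omega_1^k + \omega_2^k) \wedge \omega^l = \omega_1^k \wedge \omega^l + \omega_2^k \wedge \omega^l$ for any outer forms $\omega_1^k$, $\omega_2^k$, $\omega^l$.

(3) The outer multiplication of outer forms is skew-commutative, namely,

$$\omega^k \wedge \omega^l = (-1)^{kl}\, \omega^l \wedge \omega^k$$

for any $\omega^k$, $\omega^l$, whence it follows, in particular, that for $\omega^l = \omega^k$ we have $\omega^k \wedge \omega^k = 0$ for any odd $k$.

(4) The outer multiplication of outer forms is associative:

$$(\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m)$$

for any $\omega^k$, $\omega^l$, $\omega^m$.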
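
Properties (3) and (4) can be observed on coordinate forms. The sketch below rests on two assumptions of ours: forms are represented by ordinary functions of vector arguments, and the outer product is taken as the alternation of the ordinary product multiplied by the factor $(k+l)!/(k!\,l!)$. The helper names and sample vectors are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation of 0..k-1, found by counting inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def alternate(form, k):
    """Alternation of a form with k vector arguments."""
    def alt(*xs):
        total = Fraction(0)
        for perm in permutations(range(k)):
            total += sign(perm) * form(*(xs[j] for j in perm))
        return total / factorial(k)
    return alt


def outer(f, k, g, l):
    """Outer product of a k-form f and an l-form g (assumed normalization)."""
    coeff = Fraction(factorial(k + l), factorial(k) * factorial(l))

    def ordinary_product(*xs):
        return f(*xs[:k]) * g(*xs[k:])

    alt = alternate(ordinary_product, k + l)
    return lambda *xs: coeff * alt(*xs)


def e(i):
    """Coordinate linear form x -> x^i (index counted from 0)."""
    return lambda x: x[i]


x, y, z = (1, 2, 0, 1), (0, 1, 3, -1), (2, 0, 1, 1)

w01 = outer(e(0), 1, e(1), 1)        # 2-form built from two 1-forms
w10 = outer(e(1), 1, e(0), 1)
left = outer(w01, 2, e(2), 1)        # (e^0 ^ e^1) ^ e^2
right = outer(e(0), 1, outer(e(1), 1, e(2), 1), 2)   # e^0 ^ (e^1 ^ e^2)
swapped = outer(e(2), 1, w01, 2)     # e^2 ^ (e^0 ^ e^1)
square = outer(e(0), 1, e(0), 1)     # odd degree: product with itself
```

With this normalization the value of `left` on `x, y, z` is the determinant formed by the first three components of the three vectors.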

10.  Thus,  for  outer  forms  we  have  a  complete  analogy  with 
Grassmann  algebra  (see  footnote  on  page  352)  which  we  defined 
for  contravariant  multivectors,*  that  is  to  say,  multivectors  in  L. 

However,  we  will  demonstrate  in  the  next  section  that  this  is 
no  mere  analogy  but  is  precisely  a  Grassmann  algebra;  true,  it  is 
a  Grassmann  algebra  of  covariant  multivectors,  that  is  to  say, 
multivectors  in  the  space  L*,  which  is  conjugate  to  the  given 
space  L. 

In  this  way  the  properties  enumerated  in  Subsection  9  will  be 
proved. 

§  7.  Outer  forms  and  covariant  multivectors 

1. Given a multilinear form

$$\omega(x_1, \dots, x_k) = \sum \omega_{i_1 \dots i_k}\, x_1^{i_1} \dots x_k^{i_k} \qquad (1)$$

where $x_j^i$ are the components of a vector $x_j$ relative to a basis $e_1, \dots, e_n$ in $L$. We know that with each such form there is associated invariantly and one-to-one a tensor:

$$\omega = \sum \omega_{i_1 \dots i_k}\, e^{i_1} \dots e^{i_k} \qquad (2)$$

where $e^1, \dots, e^n$ is the basis in $L^*$ reciprocal to the basis $e_1, \dots, e_n$ in $L$ (see Section 7, Chapter V).

The numerical value of the multilinear form (1) on specifically taken vectors $x_1, \dots, x_k$ is a complete contraction of tensor (2) with these vectors, vector $x_1$ being contracted with $e^{i_1}$, vector $x_2$ with $e^{i_2}$, and so on.

Now recall that every vector of the conjugate space $L^*$ is a linear form in a vector argument in $L$. Besides, the value of this linear form on the given vector $x$ of $L$ is precisely the contraction of $x$ with that vector of $L^*$ which represents the form.

In the given case, since the bases $e_1, \dots, e_n$ and $e^1, \dots, e^n$ are reciprocal, we have for the vector $x_j = x_j^1 e_1 + \dots + x_j^n e_n$

$$e^i(x_j) = x_j^1 e^i(e_1) + \dots + x_j^i e^i(e_i) + \dots + x_j^n e^i(e_n) = x_j^i \qquad (3)$$

Now in place of (1) we can write

$$\omega(x_1, \dots, x_k) = \sum \omega_{i_1 \dots i_k}\, e^{i_1}(x_1) \dots e^{i_k}(x_k) \qquad (4)$$

Formula (4) shows that an arbitrary multilinear form $\omega(x_1, \dots, x_k)$ can be expanded into a sum of products of independent forms $e^1(x), \dots, e^n(x)$ in precisely the same way that the tensor $\omega$ of this form is expanded into a sum of products of first-order tensors $e^1, \dots, e^n$. This expresses the familiar isomorphism between multilinear forms and their tensors. For this reason, everything that has been said about multivectors can be carried over directly to outer forms.

However,  two  points  must  be  kept  in  mind. 

(1)  In  Sections  1  to  4  we  dealt  with  contravariant  multivectors. 
We  now  deal  with  multilinear  forms  to  which  correspond  covariant 
tensors.  Therefore,  first  we  have  to  define  covariant  multivectors. 
This  definition  should  be  done  in  exact  analogy  with  the  definition 
of  contravariant  multivectors:  a  skew  covariant  tensor  is  called  a 
covariant  multivector,  and  a  skew  tensor  is  defined  to  coincide 
with  its  alternation. 

(2) The alternation of forms is defined here without any direct analogy with the alternation of tensors. That is why, in Section 5, we introduced the symbol $\langle\ \rangle$ instead of $[\ ]$. Therefore we must prove the following assertion.

If $\omega(x_1, \dots, x_k)$ is an arbitrary multilinear form (1), and $\omega$ is its tensor (2), then the tensor of the alternation of this form is equal to the alternation of its tensor (that is, the form $\langle\omega(x_1, \dots, x_k)\rangle$ has the tensor $[\omega]$).



Proof. The alternation of a form is defined in Section 5 with the aid of permutations of the arguments $x_1, \dots, x_k$, namely,

$$\langle\omega(x_1, \dots, x_k)\rangle = \frac{1}{k!} \sum \delta^{1 \dots k}_{j_1 \dots j_k}\, \omega(x_{j_1}, \dots, x_{j_k})$$

whence and also from (4) we have

$$\langle\omega(x_1, \dots, x_k)\rangle = \frac{1}{k!} \sum \delta^{1 \dots k}_{j_1 \dots j_k}\, \omega_{i_1 \dots i_k}\, e^{i_1}(x_{j_1}) \cdots e^{i_k}(x_{j_k}) \tag{5}$$

The indices $i_1, \dots, i_k$ in (5) assume all values from 1 to $n$, the indices $j_1, \dots, j_k$ form all possible permutations of the numbers $1, \dots, k$.

On the other hand, from the definition of the alternation of a tensor and due to formulas (8) and (9), Section 1, we have

$$[\omega] = \frac{1}{k!} \sum \omega_{i_1 \dots i_k}\, \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k}\, e^{\alpha_1} \cdots e^{\alpha_k} \tag{6}$$

Here all the indices $i_1, \dots, i_k$, $\alpha_1, \dots, \alpha_k$ independently run through all values from 1 to $n$.

We have to prove that (6) yields the tensor of the multilinear form (5). We know that (6) is the tensor of the multilinear form

$$\frac{1}{k!} \sum \omega_{i_1 \dots i_k}\, \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k}\, e^{\alpha_1}(x_1) \cdots e^{\alpha_k}(x_k) \tag{7}$$

It is therefore necessary to establish the coincidence of the multilinear forms (5) and (7). Clearly it suffices to verify the equation

$$\sum \delta^{1 \dots k}_{j_1 \dots j_k}\, e^{i_1}(x_{j_1}) \cdots e^{i_k}(x_{j_k}) = \sum \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k}\, e^{\alpha_1}(x_1) \cdots e^{\alpha_k}(x_k) \tag{8}$$

for any fixed values of the indices $i_1, \dots, i_k$, assuming that they are all distinct (otherwise (8) yields $0 = 0$). In the right-hand member of (8) only those terms are nonzero for which the indices $\alpha_1, \dots, \alpha_k$ form a permutation of $i_1, \dots, i_k$. For this reason, the right and left members of (8) have the same number $k!$ of nonzero terms. There is a natural one-to-one correspondence between them.

Precisely, let $j_1, \dots, j_k$ be an arbitrarily chosen permutation of the numbers $1, \dots, k$. Then by permuting the forms $e^{i_1}(x_{j_1}), \dots, e^{i_k}(x_{j_k})$ in the appropriate term of the left member of (8) we get a new arrangement:

$$e^{i_1}(x_{j_1}) \cdots e^{i_k}(x_{j_k}) = e^{\alpha_1}(x_1) \cdots e^{\alpha_k}(x_k) \tag{9}$$

Associated with it is a very definite term of the right-hand member of (8). This correspondence is one-to-one since distinct permutations of $(j_1, \dots, j_k)$ yield in (9) distinct permutations of $(\alpha_1, \dots, \alpha_k)$.



It remains merely to note that the coefficients of the appropriate terms on the left and right side of (8) are equal. But, indeed, when passing from the left member of identity (9) to its right member, we permute the factors $e^{i_1}(x_{j_1}), \dots, e^{i_k}(x_{j_k})$ in such a fashion that the number labels of the arguments increase. This same permutation of factors carries the indices $(i_1, \dots, i_k)$ into the indices $(\alpha_1, \dots, \alpha_k)$. Hence

$$\delta^{1 \dots k}_{j_1 \dots j_k} = \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k} \tag{10}$$

since the permutations on the right and left of (10) are of the same parity. Thus, we have established the truth of (8), whence it follows that $[\omega]$ is the tensor of the form $\langle\omega(x_1, \dots, x_k)\rangle$.
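
The assertion just proved can be checked numerically. The following is a minimal sketch, assuming $n = k = 3$ and arbitrarily chosen integer coefficients; the helper names (`form`, `alternate_form`, `alternate_tensor`) are illustrative and do not come from the text. It alternates a form by permuting its arguments, alternates the coefficient tensor by permuting its indices, and compares the two results on one triple of vectors.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

n, k = 3, 3  # dimension of L and degree of the form (assumed values)


def sign(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


# Arbitrary (non-skew) integer coefficients omega_{i1 i2 i3}; illustrative data only.
omega = {
    idx: (7 * idx[0] + 3 * idx[1] * idx[1] + idx[2] + idx[0] * idx[2]) % 5 - 2
    for idx in product(range(n), repeat=k)
}


def form(coeff, vectors):
    """Value of sum coeff[i1..ik] * x1^{i1} * ... * xk^{ik}, as in formula (1)."""
    total = 0
    for idx, c in coeff.items():
        term = c
        for vec, i in zip(vectors, idx):
            term *= vec[i]
        total += term
    return total


def alternate_form(coeff, vectors):
    """Alternation of the form: signed average over permutations of the arguments."""
    total = sum(
        sign(p) * form(coeff, [vectors[j] for j in p])
        for p in permutations(range(k))
    )
    return Fraction(total, factorial(k))


def alternate_tensor(coeff):
    """Alternation of the tensor: signed average over permutations of the indices."""
    return {
        idx: Fraction(
            sum(sign(p) * coeff[tuple(idx[j] for j in p)] for p in permutations(range(k))),
            factorial(k),
        )
        for idx in coeff
    }


xs = [(1, 2, -1), (0, 3, 4), (2, -5, 1)]
lhs = alternate_form(omega, xs)          # value of the alternated form
rhs = form(alternate_tensor(omega), xs)  # value of the form with tensor [omega]
print(lhs == rhs)
```

Exact rational arithmetic (`Fraction`) is used so that the factor $1/k!$ introduces no rounding.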


2. From now on there is no need to retain the symbol $\langle\ \rangle$ and in future we will denote the alternation of a multilinear form by square brackets just as we do the alternation of a tensor; that is, we assume

$$[\omega(x_1, \dots, x_k)] = \langle\omega(x_1, \dots, x_k)\rangle \tag{11}$$


3.  Now  we  can  carry  over  directly  to  forms  the  basic  results 
and  relations  established  for  multivectors  in  Sections  1  to  4. 

(1) Given the linear forms (each in one argument) $u_1(x_1), u_2(x_2), \dots, u_k(x_k)$. The alternation of their product can be written thus:

$$[u_1(x_1)\, u_2(x_2) \cdots u_k(x_k)] = \frac{1}{k!} \sum \delta^{1 \dots k}_{i_1 \dots i_k}\, u_{i_1}(x_1) \cdots u_{i_k}(x_k) \tag{12}$$

Note that in (12) the arguments $x_1, \dots, x_k$ in all terms are arranged in a natural sequence.

In particular, for the basis forms $e^1, \dots, e^n$, the total number being any number $k$ and with any kind of arrangement, we have

$$[e^{i_1}(x_1) \cdots e^{i_k}(x_k)] = \frac{1}{k!} \sum \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k}\, e^{\alpha_1}(x_1) \cdots e^{\alpha_k}(x_k) \tag{13}$$

(2) If $\omega(x_1, \dots, x_k)$ is any form written as (4), then

$$[\omega(x_1, \dots, x_k)] = \sum \omega_{i_1 \dots i_k}\, [e^{i_1}(x_1) \cdots e^{i_k}(x_k)] = \sum \omega_{[i_1 \dots i_k]}\, e^{i_1}(x_1) \cdots e^{i_k}(x_k)$$

In this connection, see Subsection 15, Section 1.

(3) Skew forms may be described by the condition

$$\omega_{i_1 \dots i_k} = \omega_{[i_1 \dots i_k]}$$

that is, the condition of skew symmetry of their coefficients with respect to any pair of indices (see Section 1, Subsection 18).

(4) The outer product of several basis forms $e^1(x), \dots, e^n(x)$, taken in any quantity and in any order, may be expressed by the formula

$$e^{i_1}(x_1) \wedge e^{i_2}(x_2) \wedge \dots \wedge e^{i_k}(x_k) = k!\, [e^{i_1}(x_1)\, e^{i_2}(x_2) \cdots e^{i_k}(x_k)] \tag{14}$$

Formula (14) may be proved like (7), Section 2, since we can understand alternation in the sense of formula (13) (which is constructed in the same way as (9) of Section 1).

4. From formulas (3), (13) and (14) we get

$$e^{i_1}(x_1) \wedge e^{i_2}(x_2) \wedge \dots \wedge e^{i_k}(x_k) = \sum \delta^{i_1 \dots i_k}_{\alpha_1 \dots \alpha_k}\, x_1^{\alpha_1} \cdots x_k^{\alpha_k} = V^{i_1 \dots i_k} \tag{15}$$

Here

$$V^{i_1 \dots i_k} = \begin{vmatrix} x_1^{i_1} & \dots & x_1^{i_k} \\ \vdots & & \vdots \\ x_k^{i_1} & \dots & x_k^{i_k} \end{vmatrix}$$

is a minor of order $k$ of the matrix

$$X = \begin{pmatrix} x_1^1 & \dots & x_1^n \\ \vdots & & \vdots \\ x_k^1 & \dots & x_k^n \end{pmatrix}$$

made up of the components of the vectors $x_1, \dots, x_k$. The indices $i_1, \dots, i_k$ indicate the numbers of the columns of matrix $X$ that participate in the minor $V^{i_1 \dots i_k}$. The word "minor" is used in a conventional sense since it is not assumed that the indices $i_1, \dots, i_k$ proceed in increasing order.

Formula (15) yields numerical values of one-member outer forms, which are outer products of the basis forms $e^1(x), \dots, e^n(x)$. Like any outer forms, these one-member forms are functions of a simple multivector; in the given case, of the multivector $p = x_1 \wedge x_2 \wedge \dots \wedge x_k$. The numbers $V^{i_1 \dots i_k}$ coincide with the components of the simple multivector $p$ (see Section 4, Subsections 5, 6). If the space $L$ is equipped with a Euclidean metric and the basis $e_1, \dots, e_n$ is orthonormal, then $V^{i_1 \dots i_k}$, for $i_1 < i_2 < \dots < i_k$, are oriented volumes of the projections of the multivector $p$ on the coordinate planes (see Section 4, Subsection 5).

5. An arbitrary outer form $\omega^k(p)$, $p = x_1 \wedge x_2 \wedge \dots \wedge x_k$, may be expressed in the following manner, in terms of the basis forms (as in Subsection 7, Section 2):

$$\omega^k(p) = {}^{*}\!\sum \omega_{i_1 \dots i_k}\, e^{i_1}(x_1) \wedge \dots \wedge e^{i_k}(x_k) \tag{16}$$

Recall that the starred sigma denotes summation with the proviso that $i_1 < i_2 < \dots < i_k$.

From (15) and (16) we have

$$\omega^k(p) = {}^{*}\!\sum \omega_{i_1 \dots i_k}\, V^{i_1 \dots i_k} \tag{16a}$$

6. Since the value of the linear form $e^i$ on an arbitrary vector $x = x^1 e_1 + \dots + x^n e_n$ is equal to the component $x^i$, the symbol $x^i$ is used to denote the linear form $e^i(x) = x^i$, and instead of (16) we write

$$\omega^k(p) = {}^{*}\!\sum \omega_{i_1 \dots i_k}\, x^{i_1} \wedge x^{i_2} \wedge \dots \wedge x^{i_k} \tag{17}$$


7. For an outer form of degree two (or, as we sometimes say, an outer quadratic form), the expansion (16) in three-dimensional space looks like this:

$$\omega = \omega_{12}\, e^1(x_1) \wedge e^2(x_2) + \omega_{23}\, e^2(x_1) \wedge e^3(x_2) + \omega_{13}\, e^1(x_1) \wedge e^3(x_2) \tag{18}$$

Besides, taking into account Subsection 6, we can also write

$$\omega = \omega_{12}\, x^1 \wedge x^2 + \omega_{23}\, x^2 \wedge x^3 + \omega_{13}\, x^1 \wedge x^3 \tag{19}$$

Both (18) and (19) express the same thing in different notations. Formulas (17) and (19) are conventional and are apt to give rise to misunderstanding. When using them, one has to remember that $x^1, x^2, \dots$ are not the components of vectors but denote linear forms; for instance,

$$x^1 \wedge x^2 = e^1(x_1) \wedge e^2(x_2) = \begin{vmatrix} e^1(x_1) & e^2(x_1) \\ e^1(x_2) & e^2(x_2) \end{vmatrix}$$

This is more clearly brought out in a numerical case. Let

$$x_1 = x_1^1 e_1 + x_1^2 e_2 + x_1^3 e_3 = 2e_1 + 3e_2 + x_1^3 e_3,$$

$$x_2 = x_2^1 e_1 + x_2^2 e_2 + x_2^3 e_3 = -1e_1 + 5e_2 + x_2^3 e_3$$

Then

$$x^1 \wedge x^2 = \begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 13$$

Here we have found the numerical value of the form $x^1 \wedge x^2 = e^1 \wedge e^2$ on a specific pair of vectors. But one must bear in mind that the form $e^1 \wedge e^2$ itself, as an element of Grassmann algebra, is not a number.
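
The numerical case above can be reproduced with a short sketch. It evaluates the minor of formula (15) for the two vectors of the example; the third components, which the text leaves unspecified, are given arbitrary values here (an assumption made only so that the code runs), and the function names are illustrative.

```python
from itertools import permutations


def sign(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant of a small square matrix by the permutation expansion."""
    size = len(m)
    total = 0
    for p in permutations(range(size)):
        term = sign(p)
        for r in range(size):
            term *= m[r][p[r]]
        total += term
    return total


def minor(columns, vectors):
    """Minor built from the listed columns of the matrix whose rows are the vectors."""
    return det([[vec[c] for c in columns] for vec in vectors])


x1 = (2, 3, 7)    # third component is arbitrary (assumed value)
x2 = (-1, 5, -4)  # third component is arbitrary (assumed value)

v12 = minor((0, 1), [x1, x2])  # value of e^1 ^ e^2 on the pair (x1, x2)
v21 = minor((1, 0), [x1, x2])  # same columns in the opposite order
print(v12, v21)  # 13 -13
```

The result does not depend on the third components, and reversing the order of the column indices changes the sign, which is why the text calls $V^{i_1 \dots i_k}$ a minor only in a conventional sense.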



8. In integral calculus and in the theory of differential equations use is made of outer forms whose arguments are the differentials of variables. Then in expressions of the type (17) or (19) one writes $dx^1, \dots, dx^n$ instead of $x^1, \dots, x^n$. For instance, one frequently encounters an outer form of type (19) in the notation

$$\omega = P\, dx^1 \wedge dx^2 + Q\, dx^2 \wedge dx^3 + R\, dx^3 \wedge dx^1$$

where the coefficients $P = \omega_{12}$, $Q = \omega_{23}$, $R = -\omega_{13}$ are themselves functions of the arguments $x^1, x^2, x^3$.


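9. An outer form in the notation of (17) is convenient in cases where one has to pass to a new basis and the old components are given in terms of the new ones:

$$\left.\begin{aligned} x^1 &= P^1_{1'} x^{1'} + \dots + P^1_{n'} x^{n'}, \\ &\ \ \vdots \\ x^n &= P^n_{1'} x^{1'} + \dots + P^n_{n'} x^{n'} \end{aligned}\right\} \tag{20}$$

Then we have the following equation:

$$x^{i_1} \wedge x^{i_2} \wedge \dots \wedge x^{i_k} = {}^{*}\!\sum D^{i_1 \dots i_k}_{j'_1 \dots j'_k}\, x^{j'_1} \wedge x^{j'_2} \wedge \dots \wedge x^{j'_k} \tag{21}$$

where $D^{i_1 \dots i_k}_{j'_1 \dots j'_k}$ is the minor of the matrix

$$\begin{pmatrix} P^1_{1'} & \dots & P^1_{n'} \\ \vdots & & \vdots \\ P^n_{1'} & \dots & P^n_{n'} \end{pmatrix}$$

at the intersection of rows labelled $i_1, \dots, i_k$ and columns labelled $j'_1, \dots, j'_k$ ($j'_1 < j'_2 < \dots < j'_k$).

Formula (21) is proved by termwise multiplication of the linear forms $x^{i_1}, \dots, x^{i_k}$, which are to be viewed as linear combinations of the forms $x^{1'}, \dots, x^{n'}$ in accord with (20). We then get

$$x^{i_1} \wedge \dots \wedge x^{i_k} = \left(\sum P^{i_1}_{j'_1} x^{j'_1}\right) \wedge \dots \wedge \left(\sum P^{i_k}_{j'_k} x^{j'_k}\right) = \sum P^{i_1}_{j'_1} \cdots P^{i_k}_{j'_k}\, x^{j'_1} \wedge \dots \wedge x^{j'_k} = {}^{*}\!\sum D^{i_1 \dots i_k}_{j'_1 \dots j'_k}\, x^{j'_1} \wedge \dots \wedge x^{j'_k}$$

Remark. Formula (21) can be written immediately on the basis of Subsections 5, 6, Section 4. In formula (5) of that section it suffices merely to take the forms $x^{i_1}, \dots, x^{i_k}$ for the vectors $b_1, \dots, b_k$ and the forms $x^{1'}, \dots, x^{n'}$ for the vectors $e_1, \dots, e_n$.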
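
Formula (21) can be tested on concrete numbers. The sketch below assumes $n = 3$, $k = 2$ and an arbitrarily chosen transformation matrix; the names `P`, `new_vecs`, `lhs`, `rhs` are illustrative. Two vectors are given by their new components, their old components are computed from (20), and the value of $x^{i_1} \wedge x^{i_2}$ is compared with the starred sum on the right of (21).

```python
from itertools import combinations, permutations

n, k = 3, 2  # assumed dimension and degree


def sign(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant of a small square matrix by the permutation expansion."""
    size = len(m)
    total = 0
    for p in permutations(range(size)):
        term = sign(p)
        for r in range(size):
            term *= m[r][p[r]]
        total += term
    return total


def sub(m, rows, cols):
    """Submatrix of m at the given rows and columns, in the given order."""
    return [[m[r][c] for c in cols] for r in rows]


# Row i of P holds the coefficients expressing old coordinate x^i
# through the new coordinates, as in (20). Arbitrary illustrative values.
P = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]

new_vecs = [(1, -2, 3), (2, 0, 5)]  # two vectors in new coordinates
old_vecs = [tuple(sum(P[i][j] * v[j] for j in range(n)) for i in range(n))
            for v in new_vecs]       # the same vectors in old coordinates


def lhs(idx):
    """Value of x^{i1} ^ ... ^ x^{ik} computed from the old coordinates."""
    return det(sub(old_vecs, range(k), idx))


def rhs(idx):
    """Starred sum of (21): minors of P times values of the new basis products."""
    return sum(det(sub(P, idx, cols)) * det(sub(new_vecs, range(k), cols))
               for cols in combinations(range(n), k))


print(all(lhs(idx) == rhs(idx) for idx in permutations(range(n), k)))
```

The row indices `idx` are allowed to come in any order, while the column index sets are taken in increasing order only, which mirrors the role of the starred sigma.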



10. We conclude this section with three important theorems on outer forms that follow from definitions and the results of Section 3 by replacement of $L$ by $L^*$.

Theorem 1. The rank $r$ of any outer form of degree two is an even number that does not exceed the dimension of the space ($r = 2m \leqslant n$).

Theorem 2. If $\omega^2(x \wedge y)$ is an outer form of degree two of rank $r = 2m$, then there is a system of independent linear forms $p_1, \dots, p_m$, $q_1, \dots, q_m$ such that

$$\omega^2(x \wedge y) = p_1(x) \wedge q_1(y) + \dots + p_m(x) \wedge q_m(y)$$

Theorem 3. (Cartan's lemma for outer forms). Let $p_1(x), \dots, p_s(x)$, $q_1(x), \dots, q_s(x)$ be linear forms and let $p_1(x), \dots, p_s(x)$ be independent. To have

$$p_1(x) \wedge q_1(y) + \dots + p_s(x) \wedge q_s(y) = 0$$

it is necessary and sufficient that there exist expansions of the type

$$q_i(x) = a_{i1}\, p_1(x) + \dots + a_{is}\, p_s(x)$$

where $a_{ij} = a_{ji}$ ($a_{ij}$ are numerical coefficients).

§  8.  Outer  forms  in  three-dimensional  Euclidean  space 

1. To give a geometric illustration of the material of the preceding two sections we will consider outer forms in three-dimensional Euclidean space $E_3$ and we will show that they are closely related to the familiar operations of elementary vector algebra. We assume that $e_1, e_2, e_3$ constitute an orthonormal basis, that $a, b, p, q$ are fixed vectors, and that $x, y, z$ are variable vectors ranging over the whole of $E_3$. We denote the vector components by lower indices since the position of the indices is immaterial due to the orthonormality of the basis (in such a basis the contravariant components are equal to the covariant components of a vector, see Chapter VIII).

By Subsection 4, Section 6, we have to consider $k$-forms only for $k = 1, 2, 3$.

2. Recall (from Chapter VIII) that any linear form in Euclidean space may be represented as a scalar product of a constant vector $a$ into a variable vector $x$, and every vector $a \in E_3$ determines a linear form $(a, x)$, which we now denote by $\omega_a^1(x)$:

$$\omega_a^1(x) = (a, x) = a_1 x_1 + a_2 x_2 + a_3 x_3 \tag{1}$$

Formula (1) establishes a linear isomorphism between $E_3$, which is regarded as a set of vectors $a$, and the space of linear forms.

Putting $a = e_i$, we get the basis linear forms $\omega_{e_i}^1$, which we abbreviate to $\omega_i^1$:

$$\omega_i^1(x) = (e_i, x) = x_i, \quad i = 1, 2, 3 \tag{2}$$

Formula (1) may be viewed as an expansion of the form $\omega_a^1$ relative to the basis (2) and we can rewrite it thus:

$$\omega_a^1 = a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1 \tag{3}$$

3. Let us consider the mixed product $(a, x, y)$ in which the first factor is fixed and the other two vary. It is linear in each of the arguments and skew-symmetric in $x$ and $y$ ($(a, y, x) = -(a, x, y)$), so that it is a 2-form, which we denote by $\omega_a^2(x \wedge y)$. From elementary analytic geometry we know that the numerical value of $\omega_a^2(x \wedge y)$ is equal to the product of the area $S$ of the bivector $x \wedge y$ by the length $|a|$ of vector $a$ and by the cosine of the angle $\alpha$ between the given vector $a$ and the oriented (in standard fashion) normal of the bivector $x \wedge y$ (Fig. 75). Thus

$$\omega_a^2(x \wedge y) = (a, x, y) = S\,|a| \cos\alpha \tag{4}$$

Denote by $X$ the matrix made up of the components of vectors $x$ and $y$:

$$X = \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{pmatrix}$$

and set

$$V_{ij} = \begin{vmatrix} x_i & x_j \\ y_i & y_j \end{vmatrix}, \quad i, j = 1, 2, 3, \quad i \neq j \tag{5}$$

Then by the familiar formula for a mixed product we have

$$\omega_a^2(x \wedge y) = a_1 V_{23} + a_2 V_{31} + a_3 V_{12} \tag{6}$$


We know (see Section 7, formulas (16) to (19)) that an arbitrary 2-form $\omega^2(x \wedge y)$ in $E_3$ looks like this:

$$\omega^2(x \wedge y) = \omega_{12} V_{12} + \omega_{13} V_{13} + \omega_{23} V_{23} \tag{7}$$

Take vector $a$ with components

$$a_1 = \omega_{23}, \quad a_2 = -\omega_{13}, \quad a_3 = \omega_{12} \tag{8}$$

Then the right members of (6) and (7) coincide.

Conclusion. Any outer form of degree two in three-dimensional Euclidean space can be represented as a mixed product $(a, x, y)$, given an appropriate choice of the fixed vector $a$, which is uniquely defined by formulas (8).
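
The conclusion can be illustrated by a small computation. The sketch below, with illustrative function names and arbitrarily chosen coefficients, evaluates a 2-form from its coefficients $\omega_{12}, \omega_{13}, \omega_{23}$ as in (7), builds the vector $a$ by (8), and compares the result with the mixed product $(a, x, y)$.

```python
def cross(x, y):
    """Vector product in an orthonormal basis of three-dimensional space."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    """Scalar product in an orthonormal basis."""
    return sum(p * q for p, q in zip(x, y))


def mixed(a, x, y):
    """Mixed product (a, x, y) = (a, [x x y])."""
    return dot(a, cross(x, y))


def two_form(w12, w13, w23, x, y):
    """Value of the 2-form (7) on the pair x, y (indices are 0-based in code)."""
    def v(i, j):
        return x[i] * y[j] - x[j] * y[i]  # the minor V_ij of formula (5)
    return w12 * v(0, 1) + w13 * v(0, 2) + w23 * v(1, 2)


def vector_of_form(w12, w13, w23):
    """Vector a assigned to the 2-form by formulas (8)."""
    return (w23, -w13, w12)


w = (5, -2, 7)                    # omega_12, omega_13, omega_23 (arbitrary values)
x, y = (1, 2, 3), (4, -1, 2)
value = two_form(*w, x, y)
a = vector_of_form(*w)
print(value, mixed(a, x, y))  # 24 24
```

The minus sign in the second component of `vector_of_form` compensates for the fact that (6) contains $V_{31}$ whereas (7) contains $V_{13} = -V_{31}$.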

4. Setting $a = e_i$, we get the 2-form $\omega_{e_i}^2$, which we abbreviate to $\omega_i^2$ ($i = 1, 2, 3$). Using (2), (5) and (6) and taking into account Subsections 5 to 7, Section 7, we can write

$$\left.\begin{aligned} \omega_1^2(x \wedge y) &= V_{23} = -V_{32} = \omega_2^1(x) \wedge \omega_3^1(y), \\ \omega_2^2(x \wedge y) &= V_{31} = -V_{13} = \omega_3^1(x) \wedge \omega_1^1(y), \\ \omega_3^2(x \wedge y) &= V_{12} = -V_{21} = \omega_1^1(x) \wedge \omega_2^1(y) \end{aligned}\right\} \tag{9}$$

The numerical value of each of the 2-forms of (9) is equal to the area of the projection of the bivector $x \wedge y$ on the corresponding coordinate plane. Geometrically it is clear that this should be so. For example, the product of the area of the bivector $x \wedge y$ by the cosine of the angle between the basis vector $e_3$ and the normal of the bivector $x \wedge y$ is equal to the area of the projection of the bivector $x \wedge y$ on the coordinate plane $E_{12}$ spanned by the vectors $e_1$ and $e_2$ (see Fig. 76). The length of the vector participating in (4) is in this case equal to unity ($a = e_3$).



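Using (9), we can rewrite the expansion (6) thus:

$$\omega_a^2 = a_1 \omega_1^2 + a_2 \omega_2^2 + a_3 \omega_3^2 = a_1\, \omega_2^1 \wedge \omega_3^1 + a_2\, \omega_3^1 \wedge \omega_1^1 + a_3\, \omega_1^1 \wedge \omega_2^1 \tag{10}$$

Formula (10) establishes a linear isomorphism between the set of vectors $a$ and the space of 2-forms with arguments in $E_3$, namely, to the addition of forms $\omega_a^2$ and $\omega_b^2$ there corresponds an addition of vectors:

$$\omega_a^2 + \omega_b^2 = \omega_{a+b}^2$$

to the multiplication of forms by a scalar there corresponds a multiplication of the vector by that scalar:

$$\lambda\, \omega_a^2 = \omega_{\lambda a}^2$$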

5. A mixed product of three variable vectors $(x, y, z)$ is an outer form of degree three which we denote by $\omega_1^3$:

$$\omega_1^3(x \wedge y \wedge z) = (x, y, z)$$

Its value on the multivector $x \wedge y \wedge z$ is equal to the oriented volume $V$ of that multivector. Since the space of outer forms of degree three is one-dimensional ($k = n$; in this connection, see Chapter V, Section 8), we have

$$\omega^3(x \wedge y \wedge z) = \lambda\, \omega_1^3(x \wedge y \wedge z) = \lambda\,(x, y, z)$$

That is to say, an arbitrary outer form of degree three is proportional to the mixed product of its arguments and is uniquely determined by the numerical factor $\lambda$.


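6. We now show that to the outer multiplication of linear forms is associated the vector multiplication of vectors, namely, the following formula holds:

$$\omega_a^1(x) \wedge \omega_b^1(y) = \omega_{[a \times b]}^2(x \wedge y) \tag{11}$$

where $[a \times b]$ as usual denotes a vector product.

Proof. Using the properties of outer multiplication given in Subsection 9 of Section 6, we find

$$\omega_a^1 \wedge \omega_b^1 = (a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1) \wedge (b_1 \omega_1^1 + b_2 \omega_2^1 + b_3 \omega_3^1)$$

$$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \omega_2^1 \wedge \omega_3^1 + \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix} \omega_3^1 \wedge \omega_1^1 + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \omega_1^1 \wedge \omega_2^1 \tag{12}$$

Comparing (10) and (12) we get (11).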

Note that from the algebraic viewpoint the computation (12) coincides with the derivation of the formula for a vector product in terms of the components of the factors, which derivation is familiar from the elementary course of analytical geometry.
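
Formula (11) lends itself to a direct numerical check. The sketch below assumes that the outer product of two linear forms takes on a pair of vectors the value $\omega_a^1(x)\,\omega_b^1(y) - \omega_b^1(x)\,\omega_a^1(y)$, which is the second-order determinant rule used for $x^1 \wedge x^2$ in Section 7; the function names and the sample vectors are illustrative.

```python
def cross(x, y):
    """Vector product in an orthonormal basis of three-dimensional space."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    """Scalar product in an orthonormal basis."""
    return sum(p * q for p, q in zip(x, y))


def mixed(a, x, y):
    """Mixed product (a, x, y)."""
    return dot(a, cross(x, y))


def wedge_1_1(a, b, x, y):
    """Value of the outer product of the linear forms (a, .) and (b, .) on x, y."""
    return dot(a, x) * dot(b, y) - dot(b, x) * dot(a, y)


a, b = (1, 2, 3), (0, -1, 4)
x, y = (2, 1, -1), (3, 0, 5)

left = wedge_1_1(a, b, x, y)       # left member of (11)
right = mixed(cross(a, b), x, y)   # right member of (11): the 2-form of [a x b]
print(left, right)  # 110 110
```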

7. Formula (11) permits of another proof that an arbitrary 2-form $\omega^2(x \wedge y)$ in $E_3$ can be represented in the form of a mixed product $(a, x, y)$. From Theorem 2 of Subsection 10, Section 7, there are two linear forms, which by Subsection 2 of this section we can write as $\omega_p^1$ and $\omega_q^1$, such that

$$\omega^2(x \wedge y) = \omega_p^1(x) \wedge \omega_q^1(y) \tag{13}$$

Putting $a = [p \times q]$, from (4), (11) and (13) we get

$$\omega^2(x \wedge y) = \omega_a^2(x \wedge y) = (a, x, y)$$

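8. Associated with the outer multiplication of a linear form into a 2-form is a scalar multiplication of vectors, namely:

$$\omega_a^1(x) \wedge \omega_b^2(y \wedge z) = (a, b)\, \omega_1^3(x \wedge y \wedge z) = (a, b)(x, y, z) \tag{14}$$

Indeed, by (3) and (10) we have the expansions

$$\left.\begin{aligned} \omega_a^1 &= a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1, \\ \omega_b^2 &= b_1\, \omega_2^1 \wedge \omega_3^1 + b_2\, \omega_3^1 \wedge \omega_1^1 + b_3\, \omega_1^1 \wedge \omega_2^1 \end{aligned}\right\} \tag{15}$$

Multiplying the expressions (15) termwise and using the properties of outer multiplication, we get

$$\omega_a^1 \wedge \omega_b^2 = (a_1 b_1 + a_2 b_2 + a_3 b_3)\,(\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1) \tag{16}$$

As in the case of formula (15), Section 7, it can be shown that the value of the 3-form $\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1$ on the multivector $x \wedge y \wedge z$ is equal to a determinant composed of the components of the vectors $x, y, z$, that is,

$$\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1 = \omega_1^3 \tag{17}$$

From (16) and (17) follows (14).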
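
Formula (14) can also be verified on numbers. The sketch below assumes the usual three-term rule for the value of the outer product of a linear form and a 2-form on a triple of vectors, normalized so that $\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1$ gives the determinant as in (17); the function names and sample data are illustrative.

```python
def cross(x, y):
    """Vector product in an orthonormal basis of three-dimensional space."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    """Scalar product in an orthonormal basis."""
    return sum(p * q for p, q in zip(x, y))


def mixed(a, x, y):
    """Mixed product (a, x, y); also the value of the 2-form of a on x, y."""
    return dot(a, cross(x, y))


def wedge_1_2(a, b, x, y, z):
    """Value on (x, y, z) of the outer product of the 1-form of a and the 2-form of b.

    Assumed rule: alternate the argument of the linear form over x, y, z
    with signs +, -, +, keeping the remaining two arguments in order.
    """
    return (dot(a, x) * mixed(b, y, z)
            - dot(a, y) * mixed(b, x, z)
            + dot(a, z) * mixed(b, x, y))


a, b = (1, 2, 3), (0, -1, 4)
x, y, z = (2, 1, -1), (3, 0, 5), (1, 1, 1)

left = wedge_1_2(a, b, x, y, z)
right = dot(a, b) * mixed(x, y, z)  # (a, b)(x, y, z)
print(left, right)  # -110 -110
```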


Chapter  XI 


QUADRIC  HYPERSURFACES 


§  1.  The  general  equation  of  a  quadric  hypersurface 

1. Given a real $n$-dimensional affine space $\mathfrak{A}_n$ and, in it, a system of affine coordinates with origin $O$ and basis $e_1, \dots, e_n$.

A quadric hypersurface in $\mathfrak{A}_n$ is a locus of points $M \in \mathfrak{A}_n$ whose radius vector $x = \overrightarrow{OM}$ satisfies the equation

$$a(x, x) + 2b(x) + c = 0 \tag{1}$$

where $a(x, x)$ is a quadratic form, $b(x)$ is a linear form, and $c$ is a constant. The forms $a(x, x)$ and $b(x)$ are assumed to be invariant under a change of basis.


2. If we put $x = \overrightarrow{OM} = x_1 e_1 + \dots + x_n e_n$, then (1) can be given in coordinate notation:

$$\sum a_{ik} x_i x_k + 2 \sum b_i x_i + c = 0 \tag{2}$$

Here, $x_1, \dots, x_n$ are the coordinates of the point $M$. They are called the running coordinates and $M$ is any point. The quadratic form $a(x, x) = \sum a_{ik} x_i x_k$ is called the group of higher-degree terms of equation (1) or (2). The linear form

$$2b(x) = 2 \sum b_i x_i$$

is called the group of first-degree terms. The constant $c$ is called the constant term of the equation.

Remark. Throughout this chapter we will denote coordinates by lower indices since tensor algebra will not be used at all.

3. It may happen that for a certain equation of type (2) there will not be a single point satisfying it in the real space $\mathfrak{A}_n$. Even so we will say that such an equation is an equation of a quadric hypersurface. Sometimes the hypersurface is said to be imaginary (or zero). For instance, we say that the equation $x^2 + y^2 + z^2 + 1 = 0$ is the equation of an imaginary sphere (in Euclidean space with a system of rectangular Cartesian coordinates $x, y, z$).
These  words  naturally  are  geometrically  meaningless  as  long  as 
we  remain  in  real  space.  However,  such  unified  terminology  is 
convenient  from  the  formal  algebraic  viewpoint,  since  the  subject 
of  the  theory  to  which  this  chapter  is  devoted  is  actually  more  the 
equations  themselves  than  the  hypersurfaces.  In  the  theory  of  these 
equations  it  is  best  not  to  lose  sight  of  any  cases,  firstly,  because 
it  is  not  clear  beforehand  whether  the  equation  defines  some 
nonempty  set  of  points  or  not;  secondly,  even  when  it  defines  an 
empty  set,  the  left-hand  member  of  the  equation  may  have  some 
kind  of  mechanical  or  physical  meaning. 

4. It can be proved that in a complex affine space, any equation of type (2) defines a nonempty set of points. However, we will confine ourselves to real affine space $\mathfrak{A}_n$. Only in a few cases will we speak of complex points (for instance, if a simultaneous solution of the equations of a straight line and a hypersurface leads to complex values of the desired coordinates).

5. Equation (2) with literal coefficients is called the general equation of a quadric hypersurface. It contains $\frac{1}{2}(n + 1)(n + 2)$ terms. When $n$ is large, this number becomes very great. It is
therefore  difficult  to  investigate  directly  a  hypersurface  on  the 
basis  of  its  equation  written  in  an  arbitrary  system  of  coordinates. 

Procedures  will  be  given  later  on  that  will  permit  reducing  the 
general  equation  (2)  to  certain  special  forms  where  the  equation 
is  incomplete  and  is  called  canonical. 

§  2.  Changes  in  the  left  member  of  the  equation  under  translation 
of  the  origin 

1. Denote by the symbol $2F$ the entire left-hand member of the equation of a quadric hypersurface, viewing $F$ as a function of the running coordinates:

$$2F(x_1, \dots, x_n) = \sum a_{ik} x_i x_k + 2 \sum b_i x_i + c$$

By Section 2, Chapter III, when the origin is translated, the coordinates vary in accord with the formula $x_i = \tilde{x}_i + x_i^0$, where $x_i$ are the old coordinates of an arbitrary point, $\tilde{x}_i$ are the new coordinates of that point, and $x_i^0$ are the coordinates of the new origin in the old coordinate system.

Substitute these expressions into the function $F$ and regroup the terms, collecting those to the second and first powers of the new coordinates. In the process, make use of the symmetrical nature of the matrix of the quadratic form ($a_{ik} = a_{ki}$):

$$2F = \sum a_{ik} (\tilde{x}_i + x_i^0)(\tilde{x}_k + x_k^0) + 2 \sum b_i (\tilde{x}_i + x_i^0) + c$$

$$= \sum a_{ik} \tilde{x}_i \tilde{x}_k + 2 \sum_i \Big( \sum_k a_{ik} x_k^0 + b_i \Big) \tilde{x}_i + \Big( \sum a_{ik} x_i^0 x_k^0 + 2 \sum b_i x_i^0 + c \Big)$$

If we write the function $F$ in new coordinates in accordance with the old standard

$$2F = \sum \tilde{a}_{ik} \tilde{x}_i \tilde{x}_k + 2 \sum \tilde{b}_i \tilde{x}_i + \tilde{c}$$

then

$$\tilde{a}_{ik} = a_{ik}, \tag{I}$$

$$\tilde{b}_i = \sum_k a_{ik} x_k^0 + b_i, \tag{II}$$

$$\tilde{c} = \sum a_{ik} x_i^0 x_k^0 + 2 \sum b_i x_i^0 + c \tag{III}$$

2. The quantity $\tilde{c}$ is the left-hand member of the original equation in which the coordinates of the new origin have been substituted in place of the coordinates of the running point:

$$\tilde{c} = 2F(x_1^0, \dots, x_n^0)$$

Note that formula (II) may be written differently:

$$\tilde{b}_i = \frac{\partial F}{\partial x_i}(x_1^0, \dots, x_n^0)$$

Here, the partial derivative of $F$ with respect to the argument $x_i$ is computed from the coordinates of the new origin $(x_1^0, \dots, x_n^0)$.
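
The translation formulas (I) to (III) can be checked on a small example. The following sketch assumes $n = 2$ and arbitrarily chosen integer coefficients; the names `two_F` and `translate` are illustrative. It computes the new coefficients and confirms that the left member takes the same value at one and the same point, whether that point is described by its old or by its new coordinates.

```python
def two_F(A, b, c, x):
    """Left member 2F = sum a_ik x_i x_k + 2 sum b_i x_i + c at the point x."""
    n = len(x)
    quadratic = sum(A[i][k] * x[i] * x[k] for i in range(n) for k in range(n))
    linear = sum(b[i] * x[i] for i in range(n))
    return quadratic + 2 * linear + c


def translate(A, b, c, x0):
    """Coefficients after moving the origin to x0, by formulas (I), (II), (III)."""
    n = len(x0)
    new_b = [sum(A[i][k] * x0[k] for k in range(n)) + b[i] for i in range(n)]  # (II)
    new_c = two_F(A, b, c, x0)                                                 # (III)
    return A, new_b, new_c                                                     # (I): A unchanged


A = [[1, 2], [2, -3]]  # symmetric matrix of the quadratic form (arbitrary values)
b = [4, 5]
c = 6
x0 = [1, -2]           # new origin in old coordinates

A_new, b_new, c_new = translate(A, b, c, x0)

x_new = [3, 7]                                    # a point in new coordinates
x_old = [x_new[i] + x0[i] for i in range(2)]      # the same point in old coordinates
print(b_new, c_new)  # [1, 13] -25
print(two_F(A, b, c, x_old) == two_F(A_new, b_new, c_new, x_new))
```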


3. To make the notation of formulas (I) to (III) more compact, we introduce the matrices

$$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}, \qquad \mathbf{B} = \begin{pmatrix} a_{11} & \dots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \dots & a_{nn} & b_n \\ b_1 & \dots & b_n & c \end{pmatrix}$$

Matrix $\mathbf{B}$ and other matrices of order $n + 1$ will be indicated by a special type of print.

Relation (I) in matrix form becomes

$$\tilde{A} = A \tag{Ia}$$

Using an artificial device, it is convenient to write the function $F$ as a quadratic form. To do so, we introduce a supplementary coordinate $x_{n+1}$, using it as a conventional symbol and assuming that $x_{n+1} = 1$.



Besides, we assume that $b_i = a_{i,n+1} = a_{n+1,i}$, $c = a_{n+1,n+1}$. Then the formulas $x_k = \tilde{x}_k + x_k^0$ may be written as

$$\left.\begin{aligned} x_1 &= \tilde{x}_1 + x_1^0\, \tilde{x}_{n+1}, \\ x_2 &= \tilde{x}_2 + x_2^0\, \tilde{x}_{n+1}, \\ &\ \ \vdots \\ x_n &= \tilde{x}_n + x_n^0\, \tilde{x}_{n+1}, \\ x_{n+1} &= \tilde{x}_{n+1} \end{aligned}\right\} \tag{1}$$

The last equation shows that $\tilde{x}_{n+1} = 1$, just like $x_{n+1}$.

Because of the introduction of this supplementary coordinate, all the formulas for transformation of coordinates become homogeneous (without constant terms). What is more, the left-hand member of equation (2), Section 1, becomes homogeneous too. In the new notation we can then write

$$2F = \sum_{i,k=1}^{n} a_{ik} x_i x_k + 2 \sum_{i=1}^{n} a_{i,n+1} x_i x_{n+1} + a_{n+1,n+1} x_{n+1} x_{n+1} = \sum_{i,k=1}^{n+1} a_{ik} x_i x_k$$

What we have is a quadratic form with the matrix $\mathbf{B}$.

The matrix of transformation (1) expresses the old coordinates in terms of new ones and, in accordance with the standard, must be denoted as $\mathbf{P}^*$:

$$\mathbf{P}^* = \begin{pmatrix} 1 & 0 & \dots & 0 & x_1^0 \\ 0 & 1 & \dots & 0 & x_2^0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 1 & x_n^0 \\ 0 & 0 & \dots & 0 & 1 \end{pmatrix}$$

By regarding the function $F$ as a quadratic form of the arguments $x_1, \dots, x_{n+1}$, we can apply the familiar formula for transformation of the matrix of a quadratic form to get the matrix equation

$$\tilde{\mathbf{B}} = \mathbf{P} \mathbf{B} \mathbf{P}^* \tag{2}$$

which embraces all the formulas (I)-(III).

The  matrix  formula  (2)  permits  stating  an  important  theorem. 



Theorem 1. The determinant of matrix $\mathbf{B}$ and its rank are invariants under a translation of the coordinate origin, that is,

$$\det \tilde{\mathbf{B}} = \det \mathbf{B}, \qquad \operatorname{rank} \tilde{\mathbf{B}} = \operatorname{rank} \mathbf{B}$$

Proof. It is immediately apparent that $\det \mathbf{P}^* = \det \mathbf{P} = 1$, and so the theorem follows from formula (2).

Remark. When the origin is translated, matrix $A$ is itself an invariant, which means all its elements are preserved.
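
Formula (2) and Theorem 1 can be illustrated with a short computation. The sketch below assumes $n = 2$ and the same kind of arbitrary integer data as one might use for formulas (I) to (III); the helper names are illustrative. It forms the bordered matrix, the matrix of the translation and its transpose, multiplies them as in (2), and compares the determinants.

```python
from itertools import permutations


def sign(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant of a small square matrix by the permutation expansion."""
    size = len(m)
    total = 0
    for p in permutations(range(size)):
        term = sign(p)
        for r in range(size):
            term *= m[r][p[r]]
        total += term
    return total


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def bordered(A, b, c):
    """Matrix B of order n + 1: A bordered by the column and row b and the corner c."""
    return [A[i] + [b[i]] for i in range(len(A))] + [b + [c]]


A = [[1, 2], [2, -3]]  # arbitrary symmetric matrix
b = [4, 5]
c = 6
x0 = [1, -2]           # new origin in old coordinates

B = bordered(A, b, c)
P_star = [[1, 0, x0[0]],
          [0, 1, x0[1]],
          [0, 0, 1]]          # matrix of transformation (1)
P = transpose(P_star)

B_new = matmul(matmul(P, B), P_star)  # formula (2)
print(B_new)
print(det(B_new) == det(B))
```

The last column of `B_new` reproduces the coefficients given by formulas (II) and (III), while its upper left block coincides with `A`.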


§  3.  Changes  in  the  left  member  of  the  equation  for  a  change 
in  the  orthonormal  basis 


1.  So  as  not  to  complicate  computations,  we  henceforth  assume 
that  we  are  in  Euclidean  (point)  space  and  make  use  of  ortho¬ 
normal  bases. 

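Suppose we are changing from one orthonormal basis to another one:

$$\left.\begin{aligned} e'_1 &= l_{11} e_1 + \dots + l_{n1} e_n, \\ &\ \ \vdots \\ e'_n &= l_{1n} e_1 + \dots + l_{nn} e_n \end{aligned}\right\}$$

The orthogonal matrix $l = \|l_{ij}\|$ is written so that the first index increases along the row.

The coordinates of the points transform via the formulas

$$\left.\begin{aligned} x_1 &= l_{11} x'_1 + \dots + l_{1n} x'_n, \\ &\ \ \vdots \\ x_n &= l_{n1} x'_1 + \dots + l_{nn} x'_n \end{aligned}\right\} \tag{1}$$

The matrix

$$l^* = \begin{pmatrix} l_{11} & \dots & l_{1n} \\ \vdots & & \vdots \\ l_{n1} & \dots & l_{nn} \end{pmatrix}$$

is also orthogonal.

Formulas (1) are homogeneous (they do not contain constant terms) since the origin remains fixed.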

2. We now pass to new coordinates in the left member of equation (2), Section 1, via the formulas (1). We have

$$2F(x_1, \dots, x_n) = \sum a'_{ik} x'_i x'_k + 2 \sum b'_i x'_i + c'$$

where the primes indicate new coefficients. Due to the homogeneity of formulas (1), groups of terms of different powers transform separately. In particular, the constant term remains unaltered:

$$c' = c$$


Similarly 

X  aikxtxk  ~  S  aikxixk 
whence,  in  matrix  notation,  we  get 

A'  =  1AI' 

Since  matrix  /  is  orthogonal,  its  transpose  is  equal  to  the  in¬ 
verse,  /*  =  /■',  and  so 

A'  =  IAI~' 

As  in  the  case  of  Subsection  4,  Section  2,  Chapter  VII,  from  this 
matrix  equation  we  get  the  following  theorem. 

Theorem 1. When changing from one orthonormal basis to another, the left member of the equation has as invariants $\det A$, $\operatorname{rank} A$, and the characteristic polynomial $p(\lambda)$ of matrix A.

Remark 1. The polynomial $p(\lambda)$ contains $\det A$ as one of its coefficients, and therefore the invariance of $\det A$ follows from the invariance of $p(\lambda)$.

Remark 2. When $p(\lambda)$ is written out in full,

$$p(\lambda) = (-1)^n\{\lambda^n - p_1\lambda^{n-1} + p_2\lambda^{n-2} - \dots + (-1)^np_n\}$$

we see that $p_1, p_2, \dots, p_n$ are invariant; that is, when passing to a new orthonormal basis, the following are preserved: the sum of the principal minors of order one of matrix A, the sum of the principal minors of order two, and so on. Thus

$$\operatorname{rank} A' = \operatorname{rank} A, \qquad p'_1 = p_1,\ \dots,\ p'_{n-1} = p_{n-1}, \qquad \det A' = \det A$$
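As an illustration of Theorem 1 and Remark 2, the following minimal sketch (in Python, which is of course not part of the original text) checks numerically that the sums of the principal minors are preserved for one sample symmetric matrix of order three. The matrix `A`, the angle `t` and all helper names are assumptions chosen for the example; `L` is taken to be a rotation about the third axis, and the new matrix is formed as $A' = LAL^*$.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def det3(M):
    """Determinant of a 3 x 3 matrix (expansion along the first row)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def principal_minor_sums(M):
    """Return (p1, p2, p3): sums of the principal minors of orders 1, 2, 3."""
    p1 = M[0][0] + M[1][1] + M[2][2]
    p2 = sum(M[i][i] * M[j][j] - M[i][j] * M[j][i]
             for i in range(3) for j in range(i + 1, 3))
    return p1, p2, det3(M)

# Sample symmetric matrix of a quadratic form (an assumption for the example).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 1.0]]

# Orthogonal matrix: rotation through the angle t about the third axis.
t = 0.7
L = [[math.cos(t), math.sin(t), 0.0],
     [-math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]

A_new = matmul(matmul(L, A), transpose(L))   # A' = L A L*

old_invariants = principal_minor_sums(A)
new_invariants = principal_minor_sums(A_new)
print(old_invariants)
print(new_invariants)   # agrees with the line above up to rounding
```

The check is only numerical and covers a single matrix; the general statement is the theorem itself.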


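3. To find the law of transformation of matrix B we again introduce the notation $b_i = a_{i,\,n+1}$, $c = a_{n+1,\,n+1}$. Then

$$2F = \sum_{i,k=1}^{n+1} a_{ik}x_ix_k$$

where $x_{n+1} = 1$. We now adjoin another equation to formulas (1) and abbreviate the resulting formulas to

$$
\left.
\begin{matrix}
(1), \\
x_{n+1} = x'_{n+1}
\end{matrix}
\right\}
\tag{2}
$$

The matrix of the transformation formulas (2) is

$$
\hat L^* = \begin{pmatrix}
l_{11} & \dots & l_{1n} & 0 \\
\vdots & & \vdots & \vdots \\
l_{n1} & \dots & l_{nn} & 0 \\
0 & \dots & 0 & 1
\end{pmatrix}
$$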

It is easy to see that

$$\hat L^* = \hat L^{-1}$$

Indeed, the last relation of (2) can be inverted in trivial fashion: it suffices merely to interchange the left and right members. As to formulas (1), their inversion is associated with taking the transpose of the matrix $L^*$.

Thus the desired formula for the transformation of matrix B of the quadratic form $2F$ is

$$B' = \hat LB\hat L^* \quad \text{or} \quad B' = \hat LB\hat L^{-1} \tag{3}$$


From  (3)  we  get 

Theorem  2.  The  determinant  and  rank  of  matrix  B  are  invariant 
under  a  change  from  one  orthonormal  basis  to  another: 

$$\det B' = \det B, \qquad \operatorname{rank} B' = \operatorname{rank} B$$

4.  From  (la)  of  Section  2,  from  Theorem  1  of  Section  2,  and 
from  Theorems  1  and  2  of  Section  3  follows  the 

Corollary.  Under  a  general  transformation  of  coordinates,  which 
consists  in  the  translation  of  the  origin  and  the  change  from  an 
old  orthonormal  basis  to  a  new  orthonormal  basis,  we  have  the 
invariants 

$$\det B, \quad \operatorname{rank} B, \quad \det A, \quad \operatorname{rank} A$$

and the characteristic polynomial $p(\lambda)$ of matrix A.

§  4.  The  centre  of  a  quadric  hypersurface 

1. Ordinarily, the centre of a quadric hypersurface is taken to be that point of the space relative to which all points of the hypersurface are arranged in symmetric pairs. Thus, when we speak of the centre, we have in mind the centre of symmetry (Fig. 77).

Fig. 77

Unfortunately, in real space this definition becomes invalid in cases where there is not a single point that can satisfy the equation of the hypersurface. However, in those cases as well there may be points which for algebraic reasons it is advisable to consider as centres. For example, the centre of the imaginary sphere $x^2 + y^2 + z^2 + 1 = 0$ is the origin of the coordinate system. For this reason, we prefer to define the concept of the centre of a quadric hypersurface differently.

2. Consider the incomplete equation

$$\sum a_{ik}x_ix_k + c = 0 \tag{1}$$

If a point $(x_1, \dots, x_n)$ lies on the hypersurface (1), then the point symmetric to it, $(-x_1, \dots, -x_n)$, also lies on the hypersurface (1).

Hence  if  there  are  points  satisfying  equation  (1),  then  the  origin 
is  the  centre  of  symmetry  of  the  hypersurface  (1). 

On  this  basis  we  give  the  following  formally  algebraic  definition 
of  a  centre. 

The centre of an arbitrary quadric hypersurface is a point such that if the coordinate origin is placed at that point, then the equation of the hypersurface takes on the incomplete form of (1). Thus, we say that the centre is any point relative to which the left member of the equation possesses central symmetry (remains unaltered when $x_1, \dots, x_n$ is replaced by $-x_1, \dots, -x_n$).

3. Given a general equation of the second degree:

$$\sum a_{ik}x_ix_k + 2\sum b_ix_i + c = 0$$

We want to determine whether there is a centre and if so to find it. We carry the origin to the point $\tilde O(x_1^0, \dots, x_n^0)$ and obtain the equation

$$\sum a_{ik}\tilde x_i\tilde x_k + 2\sum \tilde b_i\tilde x_i + \tilde c = 0$$

The new origin will be the centre if and only if

$$\tilde b_i = 0, \qquad i = 1, \dots, n \tag{2}$$

Equations (2), with account taken of equations (II) of Section 2, yield the so-called equations of the centre (equations defining the centre):

$$\sum_k a_{ik}x_k^0 + b_i = 0$$

Written out in full, the equations of the centre look like this:

$$
\left.
\begin{aligned}
a_{11}x_1^0 + \dots + a_{1n}x_n^0 &= -b_1 \\
\dots\dots\dots\dots\dots\dots & \\
a_{n1}x_1^0 + \dots + a_{nn}x_n^0 &= -b_n
\end{aligned}
\right\}
\tag{3}
$$

The matrix of system (3) coincides with matrix A.

If $\det A \neq 0$, then (3) has a unique solution. Then the hypersurface has a unique centre. Such a hypersurface is called a central hypersurface.

If $\det A = 0$, then (3) is either inconsistent, in which case there is no centre (as, say, in the case of a parabola) or it is consistent, and then there are infinitely many centres (as in the case of a circular cylinder or a pair of parallel planes).
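The three possibilities for the equations of the centre can be told apart by comparing the rank of matrix A with the rank of the augmented matrix of the system. The following is a minimal sketch of that test in Python with exact rational arithmetic; the function names `rank` and `centre_type` and the sample coefficients are assumptions made for the illustration, not notation of the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def centre_type(A, b):
    """Classify the system sum_k a_ik x_k = -b_i (the equations of the centre)."""
    n = len(A)
    augmented = [list(A[i]) + [-b[i]] for i in range(n)]
    rank_A, rank_aug = rank(A), rank(augmented)
    if rank_A == n:
        return "unique centre"
    if rank_A < rank_aug:
        return "no centre"
    return "infinitely many centres"

# Ellipse x^2 + 4y^2 - 2x + 8y + 1 = 0: here 2*b1 = -2 and 2*b2 = 8.
ellipse = centre_type([[1, 0], [0, 4]], [-1, 4])
# Parabola y^2 - 2x = 0.
parabola = centre_type([[0, 0], [0, 1]], [-1, 0])
# Circular cylinder x^2 + y^2 - 1 = 0 in three-dimensional space.
cylinder = centre_type([[1, 0, 0], [0, 1, 0], [0, 0, 0]], [0, 0, 0])
print(ellipse, parabola, cylinder, sep="; ")
```

Note that the vector `b` holds the coefficients $b_i$ themselves, that is, half of the coefficients of the first-degree terms.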

4.  The  very  definition  of  a  centre  suggests  the  first  step  towards 
a  simplification  of  the  equation:  it  is  necessary  to  carry  the  origin 
to  the  centre. 

5. We introduce two symbols:

$$\delta = \det A, \qquad \Delta = \det B$$

The criterion of a central surface is the inequality $\delta \neq 0$.

§  5.  Reducing  to  canonical  form  the  general  equation  of  a  quadric 
hypersurface  in  Euclidean  space 

1.  Quadric  hypersurfaces  are  divided  into  several  classes  for 
which  we  obtain  distinct  elementary,  or  canonical,  forms  of  the 
equations. 

2. In the theoretical exposition we will not strive to save on operations and will begin with a rotation of the coordinate system. In practical computations, if the surface is central, then as a first step it is best to carry the origin to the centre.

3. We will consider a hypersurface in Euclidean space and will make use solely of orthonormal bases. Suppose we have the general equation $2F(x_1, \dots, x_n) = 0$. First consider the self-adjoint transformation with matrix A, which is the matrix of a quadratic form (the group of higher-degree terms). All roots $\lambda_1, \dots, \lambda_n$ of the characteristic polynomial of this transformation are real and there exists an orthonormal basis made up of the eigenvectors $e'_1, \dots, e'_n$, and to the eigenvector $e'_k$ corresponds the eigenvalue $\lambda_k$, $k = 1, \dots, n$ (see Chapter IX, Section 3).

We now pass to this basis while retaining the earlier origin. Then the group of higher-degree terms assumes the canonical form

$$\sum a_{ik}x_ix_k = \lambda_1(x'_1)^2 + \dots + \lambda_n(x'_n)^2$$

and the left member of the equation of the hypersurface is simplified to

$$2F = \lambda_1(x'_1)^2 + \dots + \lambda_n(x'_n)^2 + 2b'_1x'_1 + \dots + 2b'_nx'_n + c$$




The  coefficients  of  the  first-degree  terms  have  changed  and  so  they 
are  primed.  The  constant  term  c  remains  unaltered. 


4.  We  now  consider  several  specific  cases. 

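(1) All the characteristic roots are different from zero:

$$\lambda_1 \neq 0, \ \dots, \ \lambda_n \neq 0$$

It is then necessary to isolate perfect squares:

$$\lambda_k(x'_k)^2 + 2b'_kx'_k = \lambda_k\left(x'_k + \frac{b'_k}{\lambda_k}\right)^2 - \frac{(b'_k)^2}{\lambda_k}$$

Then translate the origin via the formulas

$$x'_k = x''_k - \frac{b'_k}{\lambda_k}, \qquad k = 1, \dots, n$$

There will be no first-degree terms and so the new origin is the centre of the surface. We obtain

$$\lambda_1(x''_1)^2 + \dots + \lambda_n(x''_n)^2 = H \tag{I}$$

where

$$H = -c + \sum_{k=1}^{n}\frac{(b'_k)^2}{\lambda_k}$$

Equation (I) is canonical.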
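For a concrete instance of case (1), here is a short worked computation. The particular coefficients are an assumption chosen for illustration, the equation is taken to be already referred to the basis of eigenvectors, and doubly primed coordinates denote the coordinates after the translation of the origin.

```latex
% Assumed example in E_2: \lambda_1 = 2, \lambda_2 = -3, b'_1 = 4, b'_2 = 3, c = -1.
\begin{aligned}
2F &= 2(x'_1)^2 - 3(x'_2)^2 + 8x'_1 + 6x'_2 - 1 \\
   &= 2(x'_1 + 2)^2 - 8 - 3(x'_2 - 1)^2 + 3 - 1 \\
   &= 2(x''_1)^2 - 3(x''_2)^2 - 6,
   \qquad x''_1 = x'_1 + 2, \quad x''_2 = x'_2 - 1 .
\end{aligned}
% Hence the equation 2F = 0 becomes 2(x''_1)^2 - 3(x''_2)^2 = 6, that is, H = 6,
% in agreement with H = -c + (b'_1)^2/\lambda_1 + (b'_2)^2/\lambda_2 = 1 + 8 - 3.
```

The new origin $x'_1 = -2$, $x'_2 = 1$ is the centre of this curve, which is a hyperbola.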

(2) $\lambda_1 \neq 0, \dots, \lambda_r \neq 0$, $\lambda_{r+1} = \dots = \lambda_n = 0$; r is the rank of the quadratic form of the higher-degree terms; $r \leq n - 1$.

Here  the  situation  becomes  somewhat  complicated  and  in  order 
to  avoid  cumbersome  computations  it  is  necessary  to  carry  out  the 
transformation  of  the  form  of  the  higher-degree  terms  with  an 
aptly  chosen  plan  of  action. 

Let us first of all find the eigenvectors corresponding to $\lambda_1, \dots, \lambda_r$. As we know, they may be chosen so as to form an orthonormal system $e'_1, \dots, e'_r$ (see Section 3, Chapter IX). The other basis vectors will be determined later.

Suppose we have the original equation in the initial coordinates:

$$\sum a_{ik}x_ix_k + 2\sum b_ix_i + c = 0$$

Its linear part is uniquely defined by specifying the vector

$$b = \{b_1, b_2, \dots, b_n\}$$

Decompose b into two components, one of which lies in the linear hull of $e'_1, \dots, e'_r$, the other being orthogonal to the indicated linear hull:

$$b = \beta_1e'_1 + \dots + \beta_re'_r - p$$

To do this, set

$$\beta_1 = (b, e'_1), \ \dots, \ \beta_r = (b, e'_r), \qquad p = -b + \sum_{i=1}^{r}\beta_ie'_i$$

The vector p thus constructed is orthogonal to the subspace $L(e'_1, \dots, e'_r)$.

If $p \neq \theta$, then we send $e'_n$ along the vector p, and

$$p = \mu e'_n$$

where $\mu$ is a numerical coefficient.

The vector $e'_n$ is taken to be a unit vector. It can be directed so that we have $\mu = |p|$ or so that $\mu = -|p|$.

Take the vectors $e'_{r+1}, \dots, e'_{n-1}$ so that together with the earlier constructed vectors they form an orthonormal system: $e'_1, e'_2, \dots, e'_r, e'_{r+1}, \dots, e'_{n-1}, e'_n$. Otherwise the choice of the vectors $e'_{r+1}, \dots, e'_{n-1}$ is arbitrary.

If $p = \theta$, then $e'_{r+1}, \dots, e'_n$ can be taken at pleasure as long as the system $e'_1, \dots, e'_n$ is orthonormal. Note that in this case as well the equation

$$p = \mu e'_n$$

holds true, but $\mu = |p| = 0$.

Thus

$$b = \beta_1e'_1 + \dots + \beta_re'_r - \mu e'_n \tag{1}$$

The  groups  of  second-degree,  first-degree  and  zero-degree  terms 
transform  separately  under  a  transition  to  a  new  basis. 

The higher-degree terms in the new basis take the form

$$\sum a_{ik}x_ix_k = \lambda_1(x'_1)^2 + \dots + \lambda_r(x'_r)^2$$

It is natural to write the group of first-degree terms as a scalar product:

$$\sum b_kx_k = (b, x)$$

Due to (1)

$$(b, x) = (\beta_1e'_1 + \dots + \beta_re'_r - \mu e'_n, \ x) = \beta_1x'_1 + \dots + \beta_rx'_r - \mu x'_n$$

since in an orthonormal basis the scalar product of any vector by a basis vector is equal to the corresponding component (coordinate): $(e'_i, x) = x'_i$. Thus, after changing to a new basis, we have

$$2F = \lambda_1(x'_1)^2 + \dots + \lambda_r(x'_r)^2 + 2\beta_1x'_1 + \dots + 2\beta_rx'_r - 2\mu x'_n + c = 0$$

The  constant  term  c  remains  unchanged. 




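We now isolate perfect squares for $k = 1, \dots, r$:

$$\lambda_k(x'_k)^2 + 2\beta_kx'_k = \lambda_k\left(x'_k + \frac{\beta_k}{\lambda_k}\right)^2 - \frac{\beta_k^2}{\lambda_k}$$

Then we translate the origin (only in the direction of the coordinate axes labelled $1, \dots, r$):

$$
\begin{aligned}
x'_1 &= x''_1 - \frac{\beta_1}{\lambda_1}, & x'_{r+1} &= x''_{r+1} \\
&\;\;\vdots & &\;\;\vdots \\
x'_r &= x''_r - \frac{\beta_r}{\lambda_r}, & x'_n &= x''_n
\end{aligned}
$$

The equation then becomes

$$\lambda_1(x''_1)^2 + \dots + \lambda_r(x''_r)^2 - 2\mu x''_n = H$$

If $\mu = 0$, we get the equation

$$\lambda_1(x''_1)^2 + \dots + \lambda_r(x''_r)^2 = H \tag{$\alpha$}$$

which is canonical. If $\mu \neq 0$, then we write

$$2\mu x''_n + H = 2\mu\left(x''_n + \frac{H}{2\mu}\right)$$

Again translate the origin, this time along the axis $x_n$ by the amount $-\frac{H}{2\mu}$. The labels of the coordinates remain unchanged so as not to complicate notation.

The equation becomes

$$\lambda_1(x''_1)^2 + \dots + \lambda_r(x''_r)^2 - 2\mu x''_n = 0 \tag{$\beta$}$$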

This  equation  is  canonical  too. 
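The reduction in case (2) with $\mu \neq 0$ can be followed on a small worked example. The coefficients below are an assumption chosen for illustration; the equation is taken to be already referred to the basis $e'_1, e'_2$ constructed above, with $r = 1$ and $n = 2$.

```latex
% Assumed example in E_2: \lambda_1 = 4, \beta_1 = 4, \mu = 3, c = 1.
\begin{aligned}
2F &= 4(x'_1)^2 + 8x'_1 - 6x'_2 + 1 \\
   &= 4(x'_1 + 1)^2 - 4 - 6x'_2 + 1 ,
\end{aligned}
% so that with x''_1 = x'_1 + 1, x''_2 = x'_2 the equation 2F = 0 reads
4(x''_1)^2 - 6x''_2 = 3 \qquad (H = 3).
% Since \mu = 3 \neq 0, write 6x''_2 + 3 = 6(x''_2 + 1/2) and translate the
% origin along the second axis by the amount -H/(2\mu) = -1/2; keeping the
% labels of the coordinates, we arrive at
4(x''_1)^2 - 6x''_2 = 0 .
```

The result is a parabola, in agreement with the fact that here $r = n - 1$.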

No  other  cases,  except  those  considered  above,  are  possible.  It 
now  remains  to  list  and  classify  the  quadric  hypersurfaces. 

§  6.  Classification  of  quadric  hypersurfaces  in  Euclidean  space 

1. On the basis of the foregoing simplification of equations, hypersurfaces naturally fall into the following classes:

(I) $\delta = \det A \neq 0$. This means that not a single $\lambda_i$ is equal to zero. This class includes all cases labelled (I), and only these cases. We have the canonical equation

$$\lambda_1x_1^2 + \dots + \lambda_nx_n^2 = H \tag{I}$$

Here and henceforth we write the running coordinates without any additional labels.


(II) $\delta = \det A = 0$, $\mu \neq 0$, $r = \operatorname{rank} A = n - 1$. We have the corresponding canonical equation

$$\lambda_1x_1^2 + \dots + \lambda_{n-1}x_{n-1}^2 - 2\mu x_n = 0 \tag{II}$$

(it is obtained from ($\beta$), Section 5, for $r = n - 1$).

(I') $\delta = 0$, $\mu = 0$. (Since $\delta = 0$, it follows that $r < n$.) This class includes surfaces with canonical equations of the type ($\alpha$) of Section 5, or

$$\lambda_1x_1^2 + \dots + \lambda_rx_r^2 = H \tag{I'}$$

where $1 \leq r \leq n - 1$.

(II') $\delta = 0$, $\mu \neq 0$, $r < n - 1$. This class includes surfaces with canonical equations of type ($\beta$), Section 5, for $r < n - 1$, that is,

$$\lambda_1x_1^2 + \dots + \lambda_rx_r^2 - 2\mu x_n = 0 \tag{II'}$$

Here $1 \leq r \leq n - 2$.

The foregoing classes exhaust all possibilities. Cases where the equation is of the form (I) or (II) are basic. Cases (I') and (II') repeat the basic cases, with the sole difference of being in a subspace of smaller dimension.
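The division into the four classes depends only on the number of nonzero characteristic roots and on whether $\mu$ vanishes. A minimal sketch of this bookkeeping in Python follows; the function name `classify` and its string labels are assumptions made for the illustration.

```python
def classify(eigenvalues, mu):
    """Return the class label "I", "II", "I'" or "II'" of a quadric hypersurface.

    eigenvalues -- the characteristic roots lambda_1, ..., lambda_n of matrix A
    mu          -- the coefficient mu found in Section 5 (ignored when r = n)
    """
    n = len(eigenvalues)
    r = sum(1 for lam in eigenvalues if lam != 0)   # rank of matrix A
    if r == 0:
        raise ValueError("the equation has no second-degree terms")
    if r == n:                 # delta = det A != 0
        return "I"
    if mu == 0:                # delta = 0, mu = 0
        return "I'"
    return "II" if r == n - 1 else "II'"

print(classify([1, 4], 0))       # an ellipse-type equation
print(classify([4, 0], 3))       # a parabola-type equation
print(classify([1, 1, 0], 0))    # a cylinder in three-dimensional space
print(classify([1, 0, 0], 2))    # a parabolic cylinder
```

Exact inputs (integers or fractions) are assumed, so that the comparison with zero is meaningful.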

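2. Let us write down the matrices A and B for the basic cases (the entries not written out are zeros).

Case I.

$$
A = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}, \qquad
B = \begin{pmatrix} \lambda_1 & & & \\ & \ddots & & \\ & & \lambda_n & \\ & & & -H \end{pmatrix}
$$

Case II.

$$
A = \begin{pmatrix} \lambda_1 & & & \\ & \ddots & & \\ & & \lambda_{n-1} & \\ & & & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} \lambda_1 & & & & \\ & \ddots & & & \\ & & \lambda_{n-1} & & \\ & & & 0 & -\mu \\ & & & -\mu & 0 \end{pmatrix}
$$

Definition. A quadric hypersurface is said to be nondegenerate if the matrix B is nonsingular, that is, if

$$\Delta = \det B \neq 0$$

It is clear that the surfaces (I) are nondegenerate provided that $H \neq 0$, and that all surfaces (II) are nondegenerate as well.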




3. According to Subsection 4, Section 3, the quantities $\delta = \det A$, $\Delta = \det B$, $r = \operatorname{rank} A$, $\operatorname{rank} B$, and the characteristic polynomial $p(\lambda)$ of matrix A are invariants of the left member of the equation in the class of orthonormal coordinate systems. All these quantities can be found from the left-hand member of the general equation of a quadric hypersurface specified in any orthonormal coordinates. Besides, we know the equations of the centre in any coordinates.

Therefore, without passing to the canonical equation we can determine whether a hypersurface is central or not, whether it is degenerate or not, and we can find the set of all centres and compute all roots $\lambda_i$ of the characteristic polynomial of matrix A.

Besides, we can determine H for a hypersurface of type (I). Indeed, from Subsection 2 we have

$$-H\lambda_1 \dots \lambda_n = \Delta$$

Here $\lambda_1 \dots \lambda_n = \delta \neq 0$, whence

$$H = -\frac{\Delta}{\delta}$$

For a hypersurface of type (II) we have

$$-\mu^2\lambda_1 \dots \lambda_{n-1} = \Delta$$

But in the case $\lambda_n = 0$ the product $\lambda_1 \dots \lambda_{n-1}$ is equal to the coefficient $p_{n-1}$ of the characteristic polynomial $p(\lambda)$; taken with the minus sign, it is the coefficient of $\lambda$ in $p(\lambda)$. Here it is necessary to bear in mind that the characteristic polynomial is written as indicated in Remark 2 of Subsection 2, Section 3. From this we find

$$\mu = \pm\sqrt{-\frac{\Delta}{p_{n-1}}} \tag{1}$$

The radicand in (1) is positive since the existence and reality of $\mu$ have been established in the preceding investigations.
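The formulas for H and $\mu$ in terms of the invariants lend themselves to a direct computation from the coefficients of the general equation. The sketch below, in Python with exact arithmetic, is only an illustration: the helper names and the two sample equations are assumptions, integer coefficients are assumed, and $\mu$ is returned as $\mu^2$ to avoid taking a square root.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def principal_minor_sum(A, order):
    """Sum of the principal minors of the given order of matrix A."""
    return sum(det([[A[i][j] for j in idx] for i in idx])
               for idx in combinations(range(len(A)), order))

def extended_matrix(A, b, c):
    """Matrix B: matrix A bordered by the column and row b and the corner c."""
    return [list(A[i]) + [b[i]] for i in range(len(A))] + [list(b) + [c]]

def H_from_invariants(A, b, c):
    """H = -Delta/delta for a hypersurface of type (I)."""
    return Fraction(-det(extended_matrix(A, b, c)), det(A))

def mu_squared_from_invariants(A, b, c):
    """mu^2 = -Delta/p_{n-1} for a hypersurface of type (II)."""
    return Fraction(-det(extended_matrix(A, b, c)),
                    principal_minor_sum(A, len(A) - 1))

# Ellipse x^2 + 4y^2 - 2x + 8y + 1 = 0, i.e. (x - 1)^2 + 4(y + 1)^2 = 4.
H = H_from_invariants([[1, 0], [0, 4]], [-1, 4], 1)
# Parabola 4x^2 + 8x - 6y + 1 = 0, for which mu = 3 up to sign.
mu2 = mu_squared_from_invariants([[4, 0], [0, 0]], [4, -3], 1)
print(H, mu2)
```

As before, `b` contains the coefficients $b_i$, that is, half of the coefficients of the first-degree terms.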

4. For nondegenerate hypersurfaces of Case I we have

$$\operatorname{rank} A = n, \qquad \operatorname{rank} B = n + 1 \tag{2}$$

From the first equality of (2) and from the Kronecker-Capelli theorem applied to the system of equations of the centre we conclude that all of them are central.

Let  us  consider  them  in  more  detail. 


5. If $\lambda_1, \dots, \lambda_n$ and H are numbers of one sign, then the hypersurface (I) is said to be an $(n - 1)$-dimensional ellipsoid. Its equation can be rewritten thus:

$$\frac{x_1^2}{a_1^2} + \dots + \frac{x_n^2}{a_n^2} = 1 \tag{3}$$

The quantities $a_i$ are called the semiaxes of the ellipsoid ($a_i > 0$). It is easy to verify that the ellipsoid is located in a parallelepiped defined by the inequalities $|x_i| \leq a_i$, $i = 1, \dots, n$. (For $n = 3$ see Fig. 78, for $n = 2$ see Fig. 79.) Note that a k-dimensional ellipsoid for $k = 1$ is an ellipse (Fig. 79); for $k = 0$ it is a pair of points $x_1 = \pm a_1$ (Fig. 80).


It is difficult to give a pictorial representation of ellipsoid (3) for $n > 3$. However, a comparison of Figs. 78, 79 and 80 will help the reader to picture how a k-dimensional ellipsoid becomes more and more complicated as the dimension k increases.

Fig. 80 (the pair of points $x_1 = -a_1$ and $x_1 = a_1$ on the $x_1$-axis)

If $a_1 = \dots = a_n = R$, the ellipsoid (3) is called an $(n - 1)$-dimensional sphere of radius R.

6. If $\lambda_1, \dots, \lambda_n$ are of one sign and H is of another, the surface (I) is termed an imaginary ellipsoid. It is without points in real space.


7. If $\lambda_1, \dots, \lambda_n$ are of different signs and $H \neq 0$, then surface (I) is called a hyperboloid. In this case it can be reduced to the form

$$\frac{x_1^2}{a_1^2} + \dots + \frac{x_k^2}{a_k^2} - \frac{x_{k+1}^2}{b_1^2} - \dots - \frac{x_n^2}{b_{n-k}^2} = 1 \tag{4}$$

by dividing both members of (I) by H.

The quantities $a_1, \dots, a_k$ are called the semitransverse axes and $b_1, \dots, b_{n-k}$ are called the semiconjugate axes of the hyperboloid (4) ($a_i > 0$, $b_i > 0$). Depending on the signature of the left member of equation (4), hyperboloids have different geometric structure. In elementary analytic geometry, the forms of surfaces are investigated by considering their sections by different planes (see, for instance, Figs. 82, 83). We will now apply the same procedure to obtain a picture of the structure of different hyperboloids. We consider special cases beginning with familiar objects in low-dimensional spaces:

(1) For $n = 2$, equation (4) takes the form $\frac{x_1^2}{a_1^2} - \frac{x_2^2}{b_1^2} = 1$ and specifies a hyperbola (Fig. 81).

Fig. 81


(2) For $n = 3$ there are two possibilities: a hyperboloid of two sheets (Fig. 82):

$$\frac{x_1^2}{a_1^2} - \frac{x_2^2}{b_1^2} - \frac{x_3^2}{b_2^2} = 1 \tag{5}$$

and a hyperboloid of one sheet (Fig. 83):

$$\frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} - \frac{x_3^2}{b_1^2} = 1$$

(3) $n = 4$. Here we have to consider three cases:

(a) The hyperboloid of two sheets

$$\frac{x_1^2}{a_1^2} - \frac{x_2^2}{b_1^2} - \frac{x_3^2}{b_2^2} - \frac{x_4^2}{b_3^2} = 1 \tag{6}$$

like the hyperbola and the hyperboloid (5), consists of two separate parts located in the half-spaces $x_1 \geq a_1$ and $x_1 \leq -a_1$. With hyperplanes of the form $x_1 = \text{constant}$, for $|x_1| > a_1$, it intersects in ellipsoids whose semiaxes increase with increasing $|x_1|$. With the remaining hyperplanes of the form $x_i = \text{constant}$ ($i = 2, 3, 4$) the hyperboloid (6) intersects in two-sheeted hyperboloids of type (5).

(b) The equation

$$\frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} - \frac{x_3^2}{b_1^2} - \frac{x_4^2}{b_2^2} = 1$$

specifies a new type of hyperboloid with the feature that with every one of the hyperplanes of type $x_i = \text{constant}$ it intersects either along a certain hyperboloid (one-sheeted or two-sheeted) or along a cone. (The concept of a cone was introduced in Section 12, Chapter IV; quadric cones are discussed in more detail in Subsection 9 below.) One must bear in mind that the typical section here is the hyperboloid. Cones only appear in separate hyperplanes ($|x_1| = a_1$, $|x_2| = a_2$); they may be viewed as degenerate hyperboloids. Three-dimensional space does not have an analogous surface, but for higher-dimensional spaces this is the most typical case.

(c) A hyperboloid similar to the one-sheeted hyperboloid:

$$\frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \frac{x_3^2}{a_3^2} - \frac{x_4^2}{b_1^2} = 1$$

With all hyperplanes $x_4 = \text{constant}$ it intersects along two-dimensional ellipsoids whose semiaxes increase with increasing $|x_4|$; with the remaining hyperplanes $x_i = \text{constant}$ it intersects in hyperboloids (one-sheeted or two-sheeted) that degenerate into cones if $x_i = \pm a_i$.

It is difficult to give a drawing of hyperboloids in four-dimensional space. However, they may be imagined by analogy with the lower-dimensional cases shown in Figs. 81 to 83. Bear in mind that sections by hyperplanes have the form of the surfaces depicted in Figs. 78, 82, 83, and 25 (also see Figs. 77 and 88).

In the general case we have the following possibilities:

(a) $n \geq 2$, $k = 1$ is a two-sheeted hyperboloid consisting of two parts located in the half-spaces $x_1 \geq a_1$ and $x_1 \leq -a_1$. The hyperplanes $x_1 = \text{constant}$ ($|x_1| > a_1$) intersect it in $(n - 2)$-dimensional ellipsoids, the remaining hyperplanes $x_i = \text{constant}$ ($i = 2, \dots, n$) along two-sheeted hyperboloids.

(b) $n \geq 4$, the number of positive and also the number of negative terms in the left member of (4) is at least two. Such a hyperboloid intersects each of the hyperplanes $x_i = \text{constant}$ along some hyperboloid of smaller dimension (even one degenerating into a cone).

(c) $n \geq 3$, $k = n - 1$ is a hyperboloid similar to the one-sheeted hyperboloid. All hyperplanes $x_n = \text{constant}$ intersect it along $(n - 2)$-dimensional ellipsoids, the remaining hyperplanes $x_i = \text{constant}$ ($i = 1, \dots, n - 1$), along hyperboloids or cones.

It is easy to prove (say, by induction on the dimension of the space) that all hyperboloids, except two-sheeted ones, contain rectilinear generators.

Also note that every k-dimensional plane of the type $x_{k+1} = c_{k+1}, \dots, x_n = c_n$ ($c_i = \text{constant}$) intersects hyperboloid (4) in a $(k - 1)$-dimensional ellipsoid, and among the $(n - k)$-dimensional planes of the type $x_1 = c_1, \dots, x_k = c_k$ ($c_i = \text{constant}$) there are such that intersect hyperboloid (4) in $(n - k - 1)$-dimensional ellipsoids. It can be proved that on the hyperboloid (4) there are r-dimensional ellipsoids of all possible dimensions $r \leq \max(k - 1, n - k - 1)$ and there are no ellipsoids of higher dimension. For $n = 2, 3$ this is evident from a comparison of Figs. 79 to 83. We will not consider the general case.

If  in  addition  to  the  given  Euclidean  metric  we  introduce  into 
the  space  another  quadratic  metric  with  an  alternating  quadratic 
form,  then  the  role  of  spheres  there  will  be  played  by  hyperboloids. 
In  this  connection,  the  two-sheeted  hyperboloid  in  four-dimensional 
space  plays  an  important  part  in  the  theory  of  relativity. 

8. All surfaces with canonical equations of type (II) are nondegenerate and are called paraboloids.

Every one of the paraboloids is devoid of any centre. By the Kronecker-Capelli theorem the system of equations of a centre in the case of paraboloids is inconsistent, since the rank of the basic matrix A is equal to $n - 1$ and the rank of the augmented matrix is equal to n (see matrices A and B which are described in detail in Subsection 2).

There are many different types of paraboloids due to the different combinations of signs of $\lambda_i$. They can be investigated as is done in the preceding subsection.

9. We now consider degenerate surfaces, that is, those for which $\Delta = \det B = 0$.

They are conveniently classified into three groups.

(1) Case (I), provided that $H = 0$. Then $\operatorname{rank} A = \operatorname{rank} B = n$ and the hypersurface is central. The equation is of the form

$$\lambda_1x_1^2 + \dots + \lambda_nx_n^2 = 0 \tag{7}$$

We know that a homogeneous equation of the second degree specifies a cone (see Section 12, Chapter IV). If all $\lambda_i$ are of the same sign, then the cone is imaginary. (Bear in mind that it has one real point, and that is the centre.)

If the $\lambda_i$ are of different signs, the cone is called a real cone in the sense that it has real points besides the centre. By renumbering the coordinates and changing the sign of the left member of equation (7) it is possible to achieve $\lambda_1 > 0$, $\lambda_n < 0$ and see that the number of negative terms does not exceed the number of positive ones. Then (7) reduces to

$$\frac{x_1^2}{a_1^2} + \dots + \frac{x_k^2}{a_k^2} - \frac{x_{k+1}^2}{b_1^2} - \dots - \frac{x_n^2}{b_{n-k}^2} = 0 \tag{8}$$

where $n - k \leq k$. The hyperplane $x_n = \text{constant} \neq 0$ intersects cone (8) in an $(n - 2)$-dimensional ellipsoid if $k = n - 1$, or in a hyperboloid if $k < n - 1$ (this latter case is only possible for $n \geq 4$). It is easy to verify that the cone (8) consists of all possible straight lines passing through the origin and through points of the surface in which it intersects the hyperplane $x_n = \text{constant} \neq 0$.

(2) Consider equation (I'). Let $E^*$ be a subspace of dimension r spanned by the basis vectors $e_1, \dots, e_r$. In this subspace, equation (I') defines a hypersurface of type (I), which we denote by S.

In the case of any dimension, equation (I') defines a hypersurface called a cylinder. Here is what we have in detail.

Denote by $E^{**}$ the orthogonal complement of subspace $E^*$ ($E^{**} = L(e_{r+1}, \dots, e_n)$). Take an arbitrary vector a in $E^{**}$ and translate the hypersurface (I') by the vector a. Then in the case of each point only those coordinates will change that do not appear in (I'), that is, $x_{r+1}, \dots, x_n$. Therefore the points obtained in any such displacement also satisfy equation (I'). Thus, the hypersurface (I') is obtained by a parallel transfer of S in all possible directions in subspace $E^{**}$. This construction is a multidimensional generalization of how the cylinder $x^2 + y^2 = 1$ in $E_3$ is obtained via a parallel displacement of a circle along the z-axis ($x_3$) (Fig. 84). The role of rectilinear generators on cylinder (I') is played by $(n - r)$-dimensional planes parallel to $E^{**}$.

All the cylinders (I') have infinitely many centres. It is easy to verify that the collection of all centres coincides with the subspace $E^{**}$.


Fig.  84 


(3) Equations of type (II') determine surfaces called parabolic cylinders. Denote by $E'$ the $(r + 1)$-dimensional subspace spanned by the vectors $e_1, \dots, e_r, e_n$ and its orthogonal complement by $E''$. In the subspace $E'$ the equation (II') determines a paraboloid, which we denote by S, as before. The hypersurface (II') is formed by a parallel displacement of the paraboloid S by all possible vectors of the subspace $E''$. Parabolic cylinders do not have centres.

§  7.  Affine  transformations 

1. We assume a system of affine coordinates has been introduced in an affine space $\mathfrak{A}_n$ and we consider the transformation, specified by the following formulas, of an arbitrary point $M(x_1, \dots, x_n)$ into the point $M'(x'_1, \dots, x'_n)$:

$$
\begin{aligned}
x'_1 &= a_{11}x_1 + \dots + a_{1n}x_n + b_1 \\
&\;\;\vdots \\
x'_n &= a_{n1}x_1 + \dots + a_{nn}x_n + b_n
\end{aligned}
\tag{1}
$$

We assume that the $n \times n$ matrix $A = \|a_{ik}\|$ is nonsingular, that is, that $\det A \neq 0$.

Such a transformation of $\mathfrak{A}_n$ is termed an affine transformation. Since the matrix A is nonsingular, the affine transformation is one-to-one.

It  is  readily  verified  that  if  two  formulas  of  type  (1)  differ  even 
by  a  single  coefficient,  then  the  affine  transformations  that  they 
specify  in  some  system  of  affine  coordinates  are  distinct  (in  the 
sense  of  Subsection  1,  Section  2,  Chapter  VI). 

2. The definition of the class of affine transformations is invariant with respect to a choice of affine coordinates.

True enough, for if we pass to other affine coordinates, the old coordinates of a point M and its image M' will be expressed in terms of the new coordinates by formulas of the first degree, and all relations that will be involved are uniquely reversible. Therefore when passing to other affine coordinates we will not go outside the class of uniquely reversible formulas of the type (1).

3. Suppose we have, in affine space, a geometric figure $\mathcal{A}$ specified by an equation of the type

$$F(x) = 0 \tag{2}$$

where the symbol x denotes the coordinates of the running point. Let there be given an affine transformation which we denote symbolically as

$$x' = \varphi(x) \tag{3}$$

We seek the equation of the image $\mathcal{A}'$ of the figure $\mathcal{A}$. From (3) we find that $x = \varphi^{-1}(x')$. Substituting this expression into (2), we get the equation

$$F(\varphi^{-1}(x')) = 0 \tag{4}$$

which is satisfied by all points of the figure $\mathcal{A}'$. Because of the one-to-oneness of the transformation (3) there are no extra points (points not belonging to $\mathcal{A}'$ do not satisfy (4)).

We do not change the coordinate system in space and so for the coordinates of the running point it is convenient to retain the original symbol x (and not x', as in (4)). Finally, for $\mathcal{A}'$ we get the equation

$$F(\varphi^{-1}(x)) = 0$$
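The rule "substitute the inverse transformation into the equation" can be tried out on a small example. In the sketch below (Python, exact arithmetic) the figure is the unit circle and the affine transformation is a stretching followed by a translation; the function names `F`, `phi`, `phi_inverse`, `F_image` and the particular coefficients are assumptions made for the illustration.

```python
from fractions import Fraction

def F(x):
    """Left member of the equation F(x) = 0 of the unit circle."""
    return x[0] ** 2 + x[1] ** 2 - 1

def phi(x):
    """An affine transformation: x1' = 2*x1 + 1, x2' = 3*x2 - 2."""
    return (2 * x[0] + 1, 3 * x[1] - 2)

def phi_inverse(x):
    """The inverse transformation, obtained by solving x' = phi(x) for x."""
    return ((x[0] - 1) / Fraction(2), (x[1] + 2) / Fraction(3))

def F_image(x):
    """Left member of the equation of the image: F(phi^{-1}(x)) = 0."""
    return F(phi_inverse(x))

# Points of the circle with rational coordinates.
points_on_circle = [(Fraction(3, 5), Fraction(4, 5)),
                    (Fraction(0), Fraction(1)),
                    (Fraction(-1), Fraction(0))]

# The images of these points must satisfy the equation of the image.
residuals = [F_image(phi(p)) for p in points_on_circle]
print(residuals)
```

Here the image is the ellipse $\frac{(x_1 - 1)^2}{4} + \frac{(x_2 + 2)^2}{9} = 1$, which is exactly what `F_image` encodes.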


Remark. Actually, the only thing used is the one-to-one nature of the transformation (3). We will therefore make use of the results of this subsection in Chapter XII when considering transformations that are more general than affine transformations.




4. Affine transformations preserve the degree of the algebraic equations; namely, if the coordinates of the point M satisfy an algebraic equation of degree k, then the coordinates of the point M' satisfy an equation of the same degree.

Proof. Since (1) are of the first degree, no term of the equation can increase its degree as a result of the transformation. Neither can there be a reduction in the degree because otherwise we would have a rise in the degree during the reverse transformation.

Corollary. Under an affine transformation, a hypersurface is transformed into a hypersurface.

5. Theorem 1. Under an affine transformation, any plane of
dimension k is carried into a plane of the same dimension.

Proof. Let a k-dimensional plane $P_k$ be specified by a linear
system of rank n − k containing n − k equations. Writing this
system in matrix form, we have

$$Sx = s$$

where S is an (n − k) × n matrix and s denotes the column
matrix of constant terms. We also write down the affine
transformation (1) in matrix form:

$$x' = Ax + b$$

Using Subsection 2, we find the matrix equation of the image $P'_k$
of the plane $P_k$:

$$SA^{-1}x = \tilde{s} \qquad (5)$$

where $\tilde{s} = s + SA^{-1}b$ is a new column of constant terms. System
(5) also contains n − k equations in the coordinates of the running
point x. By Subsection 3, Section 4, Chapter II, and due to the
nonsingularity of A, we have

$$\operatorname{rank} SA^{-1} = \operatorname{rank} S = n - k$$

Hence,  system  (5)  is  consistent  and  determines  a  plane  of  the 
same  dimension  k,  which  was  what  we  set  out  to  prove. 
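
The computation in this proof can be checked numerically. The following is a minimal sketch, assuming n = 2 and k = 1; the line, the matrix A, the column b, and all helper names are hypothetical choices made for illustration and do not come from the text. It forms $SA^{-1}$ and $s + SA^{-1}b$ and tests that the images of points of the line satisfy equation (5).

```python
from fractions import Fraction as Fr

def mat_vec(M, v):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(M, N):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

def inverse_2x2(A):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: the line x1 + 2*x2 = 3 and an affine map x' = Ax + b.
S, s = [[Fr(1), Fr(2)]], [Fr(3)]
A, b = [[Fr(2), Fr(1)], [Fr(1), Fr(1)]], [Fr(5), Fr(-1)]

S_img = mat_mul(S, inverse_2x2(A))                      # S A^{-1}
s_img = [u + v for u, v in zip(s, mat_vec(S_img, b))]   # s + S A^{-1} b

# Images of three points of the original line under x' = Ax + b.
points = [[Fr(3), Fr(0)], [Fr(1), Fr(1)], [Fr(-1), Fr(2)]]
images = [[y + t for y, t in zip(mat_vec(A, p), b)] for p in points]
print(all(mat_vec(S_img, x) == s_img for x in images))
```

Exact rational arithmetic is used so that the membership test is an equality rather than a tolerance check.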

6. It is clear that affine transformations preserve parallelism of
hyperplanes, for if prior to the transformation two hyperplanes did
not intersect, then neither will their images intersect; this is due
to the one-to-oneness of the transformation, whence follows the
parallelism of their images (see Section 6, Chapter III).

7.  The  following  general  assertion  holds  true. 

An  affine  transformation  preserves  parallelism  of  planes  of  any 
dimension.  The  proof  is  left  to  the  reader. 




8. Theorem 2. An affine transformation in n-dimensional affine
space $\mathfrak{A}_n$ is uniquely defined if for the inverse images there is
specified an arbitrary ordered system of n + 1 points $M_0, M_1, \ldots, M_n$
in the general position, and if for their images we have an
arbitrary analogous system $N_0, N_1, \ldots, N_n$.

Proof. Introduce in $\mathfrak{A}_n$ an affine coordinate system with $M_0$ as
origin and with the vectors $\overrightarrow{M_0M_1}, \ldots, \overrightarrow{M_0M_n}$ as basis. (These
vectors are independent since the points $M_0, M_1, \ldots, M_n$ are in the
general position; see Subsection 5, Section 3, Chapter III.) In these
coordinates, the desired affine transformation is given by formulas
of type (1), in which the column of constant terms consists of the
coordinates of the point $N_0$ and the column of coefficients of $x_i$
consists of the coordinates of the vector $\overrightarrow{N_0N_i}$. The condition
$\det A \neq 0$ is fulfilled because the points $N_i$ are in the general
position. Thus, the desired transformation exists and is unique.
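
The construction used in the proof can be sketched as follows. This is only an illustration under stated assumptions: coordinates are taken in the frame with origin $M_0$ and basis $\overrightarrow{M_0M_i}$, so that $M_0$ is the zero column and $M_i$ the i-th unit column; the function names and the sample points are hypothetical.

```python
def affine_map_from_images(N):
    """Return (A, b) for the map sending M0, M1, ..., Mn to N0, N1, ..., Nn.

    N = [N0, N1, ..., Nn] holds the coordinates of the image points in the
    frame (M0; M0M1, ..., M0Mn).  Column i of A is the vector N0Ni and the
    column of constant terms b is N0, as in the proof.
    """
    n = len(N) - 1
    N0 = N[0]
    A = [[N[i + 1][r] - N0[r] for i in range(n)] for r in range(n)]
    return A, list(N0)

def apply_affine(A, b, x):
    """Evaluate x' = Ax + b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]

# Hypothetical image points in general position in the plane (n = 2).
N = [(1, 1), (3, 2), (0, 4)]
A, b = affine_map_from_images(N)

# In the chosen frame M0 = (0, 0), M1 = (1, 0), M2 = (0, 1).
M = [(0, 0), (1, 0), (0, 1)]
print([tuple(apply_affine(A, b, m)) for m in M] == N)
```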


9. Consider two transformations:

(1) The linear transformation

$$\begin{aligned} x'_1 &= a_{11}x_1 + \ldots + a_{1n}x_n, \\ &\;\;\vdots \\ x'_n &= a_{n1}x_1 + \ldots + a_{nn}x_n \end{aligned} \qquad (6)$$

(2) The parallel transfer

$$\begin{aligned} x'_1 &= x_1 + b_1, \\ &\;\;\vdots \\ x'_n &= x_n + b_n \end{aligned} \qquad (7)$$

Under parallel transfer, all points are displaced simultaneously
by one and the same vector $\{b_1, \ldots, b_n\}$.

Every transformation of type (1) is a composition of the
transformations (6) and (7), and conversely.

For  this  reason,  every  affine  transformation  is  a  composition  of 
the  linear  transformation  (6),  provided  that  it  is  nonsingular,  and 
the  parallel  transfer  (7). 


10.  Now  consider  Euclidean  space.  By  the  preceding  subsection 
and  Section  11  of  Chapter  IX,  every  affine  transformation  can  be 
represented  as  a  composition  of  a  self-adjoint  transformation,  an 
isometric  transformation,  and  a  parallel  transfer. 

We can speak of an isometric transformation (or, briefly,
isometry) in a somewhat broader sense than in Chapter IX, namely,
we can consider it not as a linear transformation, but as a
transformation of type (1) that preserves distance between points. Then
both linear isometric transformations and parallel transfers are
isometries, and their composition is an isometry. For this reason, an
affine transformation is a product of an isometry and of contractions
along n orthogonal directions.

11. It is important to point out that all affine transformations
of the space $\mathfrak{A}_n$ constitute a group, which is called the affine group
of the space $\mathfrak{A}_n$.

To  prove  this,  it  is  sufficient,  by  Section  2  of  Chapter  VI,  to 
verify  that: 

(1)  an  affine  transformation  is  invertible  and  the  inverse  is  also 
an  affine  transformation; 

(2) the product of two affine transformations is an affine
transformation.

Both  properties  obviously  follow  from  Subsection  1. 
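
The two group properties can be illustrated on pairs (A, b) standing for x' = Ax + b. The sketch below assumes n = 2 and uses hypothetical helper names and sample data; it forms the product of two affine transformations and the inverse of one, both again as pairs of the same kind.

```python
from fractions import Fraction as Fr

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(M, N):
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def compose(f, g):
    """Affine map x -> f(g(x)): matrix A_f A_g, constant column A_f b_g + b_f."""
    (A, b), (C, d) = f, g
    return mat_mul(A, C), [u + v for u, v in zip(mat_vec(A, d), b)]

def inverse(f):
    """Inverse affine map: matrix A^{-1}, constant column -A^{-1} b."""
    A, b = f
    A_inv = inverse_2x2(A)
    return A_inv, [-v for v in mat_vec(A_inv, b)]

# Hypothetical nonsingular affine map and the identity map.
f = ([[Fr(2), Fr(1)], [Fr(1), Fr(1)]], [Fr(5), Fr(-1)])
identity = ([[Fr(1), Fr(0)], [Fr(0), Fr(1)]], [Fr(0), Fr(0)])

print(compose(f, inverse(f)) == identity and compose(inverse(f), f) == identity)
```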

12. Definition. Two figures $\mathcal{A}$ and $\mathcal{A}'$ in affine space are said to
be affinely equivalent if one of them is an image of the other under
some affine transformation.

Since affine transformations constitute a group, we have the
following properties of the affine equivalence of figures:

(1) if $\mathcal{A}$ is equivalent to $\mathcal{A}'$, then $\mathcal{A}'$ is equivalent to $\mathcal{A}$;

(2) if $\mathcal{A}$ is equivalent to $\mathcal{A}'$, and $\mathcal{A}'$ is equivalent to $\mathcal{A}''$, then
$\mathcal{A}$ is equivalent to $\mathcal{A}''$;

(3) every figure is equivalent to itself.

Example.  Every  ellipse  on  a  two-dimensional  Euclidean  plane  is 
affinely  equivalent  to  the  unit  circle. 

It is left to the reader to prove that in the space $\mathfrak{A}_n$ any two
planes of the same dimension k (1 ≤ k ≤ n − 1) are affinely
equivalent.

§  8.  Affine  classification  of  quadric  hypersurfaces 

1. We have established that in the affine space $\mathfrak{A}_n$ all quadric
hypersurfaces are distributed into a finite number of classes so
that within one class all surfaces are affinely equivalent to one
another. This distribution into classes is called the affine
classification of quadric hypersurfaces. (One also speaks of a
classification relative to the affine group.)

2. Take an arbitrary quadric hypersurface and reduce its
equation to canonical form. Algebraically, this means that we
transform the left member of the equation via certain formulas of type
(1), Section 7. If we regard these formulas not as formulas for the
transformation of coordinates, but as an affine transformation,
then the resulting canonical equation yields a hypersurface that
is affinely equivalent to the original one. If one more affine
transformation is carried out, namely a contraction along the directions
of the coordinate axes, then all nonzero coefficients can be reduced
to +1 or −1 in any canonical equation.

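For this reason, in cases (I) and (I'), Section 6, we obtain, for
H ≠ 0,

$$\pm x_1^2 \pm x_2^2 \pm \ldots \pm x_r^2 = 1 \quad (1 \leq r \leq n) \qquad (1)$$

in cases (I) and (I'), Section 6, we obtain, for H = 0,

$$\pm x_1^2 \pm x_2^2 \pm \ldots \pm x_r^2 = 0 \quad (1 \leq r \leq n) \qquad (2)$$

and in cases (II) and (II') we get

$$\pm x_1^2 \pm \ldots \pm x_r^2 - 2x_n = 0 \quad (1 \leq r \leq n - 1) \qquad (3)$$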

Surfaces specified by distinct equations of type (1), (2) and (3)
cannot be carried into one another via an affine transformation
because of the law of inertia of quadratic forms. Distinct equations
here are those which cannot be carried one into another by
multiplication by (−1) and renumbering the coordinates.

We  have  thus  obtained  the  desired  classes  of  affinely  equivalent 
hypersurfaces,  each  of  which  has  its  representative  among  the 
equations  (1),  (2),  (3). 

§  9.  The  intersection  of  a  straight  line  with  a  quadric 
hypersurface.  Asymptotic  directions 

1. Given a hypersurface

$$2F = \sum a_{ik}x_ix_k + 2\sum b_kx_k + c = 0 \qquad (1)$$

and an arbitrary point $M_0$ with coordinates $(x_1^0, \ldots, x_n^0)$. Draw a
straight line through $M_0$ in the direction of a vector
$l = \{l_1, \ldots, l_n\}$.

We  now  seek  the  points  of  intersection  of  the  straight  line  and 
the  hypersurface  (1). 

The coordinates of the running point M on the indicated line
are given by the equations

$$x_k = x_k^0 + \tau l_k, \quad -\infty < \tau < \infty \qquad (2)$$

To find the points of intersection it is necessary to solve
simultaneously the equations (1) and (2). Putting (2) in (1), we get

$$\tau^2\sum a_{ik}l_il_k + 2\tau\Big(\sum a_{ik}x_i^0l_k + \sum b_kl_k\Big) + 2F(x_1^0, \ldots, x_n^0) = 0 \qquad (3)$$

It is necessary to investigate equation (3).

2. If $\sum a_{ik}l_il_k \neq 0$, then (3) is a quadratic equation. In this
case there are two points of intersection; these points may be two
distinct real points, two distinct complex conjugate points, and,
finally, coincident points. In the latter case we say that the
straight line has a multiple point of intersection with the surface.

Example. On the Euclidean plane, the circle $x^2 + y^2 = 1$ and
the straight line x = −2 do not have real points of intersection
(Fig. 85). A simple computation yields the coordinates of the
complex conjugate points of their intersection $(-2, \pm i\sqrt{3})$.
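
The "simple computation" can be sketched in code. The functions below are an illustrative assumption, not part of the text: they evaluate the three coefficients of the quadratic in the line parameter $\tau$ obtained by substituting $x_k = x_k^0 + \tau l_k$ into the equation of a quadric with symmetric coefficients $a_{ik}$, and solve it over the complex numbers. The circle and the line of the example are encoded with the hypothetical choice $M_0 = (-2, 0)$, $l = \{0, 1\}$.

```python
import cmath

def line_quadric_coefficients(a, b, c, x0, l):
    """Coefficients (quad, lin, const) of quad*tau^2 + 2*lin*tau + const = 0.

    The matrix a is assumed symmetric (a[i][k] == a[k][i]).
    """
    n = len(x0)
    idx = [(i, k) for i in range(n) for k in range(n)]
    quad = sum(a[i][k] * l[i] * l[k] for i, k in idx)
    lin = sum(a[i][k] * x0[i] * l[k] for i, k in idx) + sum(b[k] * l[k] for k in range(n))
    const = (sum(a[i][k] * x0[i] * x0[k] for i, k in idx)
             + 2 * sum(b[k] * x0[k] for k in range(n)) + c)   # this is 2F(x0)
    return quad, lin, const

def intersection_parameters(quad, lin, const):
    """Both roots of quad*tau^2 + 2*lin*tau + const = 0 (quad nonzero)."""
    root = cmath.sqrt(lin * lin - quad * const)
    return (-lin + root) / quad, (-lin - root) / quad

# Circle x^2 + y^2 - 1 = 0 and the line x = -2 through M0 = (-2, 0), l = {0, 1}.
a, b, c = [[1, 0], [0, 1]], [0, 0], -1
x0, l = [-2, 0], [0, 1]

coeffs = line_quadric_coefficients(a, b, c, x0, l)
taus = intersection_parameters(*coeffs)
points = [(x0[0] + t * l[0], x0[1] + t * l[1]) for t in taus]
print(coeffs, points)
```

The two parameter values are purely imaginary, so the points of intersection have real abscissa −2 and conjugate imaginary ordinates.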

Fig. 85

In order to see these points, consider a circle $x^2 + y^2 = 1$ on
a two-dimensional complex plane. For a model of a two-dimensional
complex plane we take four-dimensional real space (see
Section 11, Chapter I). Set

$$x = u + i\xi, \quad y = v + i\eta \qquad (4)$$

Putting these expressions into the equation of the circle and
separating the real and imaginary parts, we get

$$u^2 + v^2 - \xi^2 - \eta^2 = 1, \quad u\xi + v\eta = 0 \qquad (5)$$

System (5) shows that the circle $x^2 + y^2 = 1$, considered on the
two-dimensional complex plane (4), is depicted in the
four-dimensional space of the variables $(u, v, \xi, \eta)$ as the intersection of a
hyperboloid and a cone.

The straight line x = −2 is depicted in the space of variables
$(u, v, \xi, \eta)$ as a two-dimensional plane:

$$u = -2, \quad \xi = 0 \qquad (6)$$

Consider in four-dimensional space the three-dimensional
subspace $\xi = 0$. The plane (6) is completely contained in it, while the
circle (5) intersects this subspace in the figure

$$u^2 + v^2 - \eta^2 = 1, \quad \eta v = 0 \qquad (7)$$

consisting of the ordinary circle

$$u^2 + v^2 = 1, \quad \eta = 0$$

(it is precisely this circle that we see on the real Euclidean plane)
and of the hyperbola

$$u^2 - \eta^2 = 1, \quad v = 0$$

(Fig. 86). The plane (6) and the figure (7) intersect in the same
points $u = -2$, $v = 0$, $\eta = \pm\sqrt{3}$ that have just been under
discussion.

3. If

$$\sum a_{ik}l_il_k = 0 \qquad (8)$$

then for (3) we will have either a first-degree equation, or a
contradictory equation, or an identity. In the first of these three cases,
we say that the straight line intersects the hypersurface once in
a finite point and the next time at infinity. In the second case, we
say that the line has a double intersection at infinity with the
hypersurface. In the third case, the line lies entirely on the
hypersurface. In all three cases, we say that the line has an asymptotic
direction relative to the given hypersurface. The asymptotic
direction is given by the vector $l = \{l_1, \ldots, l_n\}$, provided we have (8).

All straight lines having asymptotic directions and passing
through a single point form a cone (Fig. 87). From (2) and (8) we
obtain the equation of the cone of asymptotic directions, the vertex
of which is located at the point $M_0$:

$$\sum a_{ik}(x_i - x_i^0)(x_k - x_k^0) = 0$$

From the equations of the centre it follows that if $M_0$ lies at the
centre of a hypersurface and the vector l has an asymptotic
direction, then equation (3) assumes the form

$$0 \cdot \tau^2 + 0 \cdot \tau + 2F(x_1^0, \ldots, x_n^0) = 0$$


Then if $F(x_1^0, \ldots, x_n^0) \neq 0$, it follows that the straight lines
forming this cone do not meet the hypersurface at a single finite
point. These straight lines may be called asymptotic lines, and the
cone an asymptotic cone.

Fig. 87

Instances are the asymptotic cones of hyperboloids in
three-dimensional space (Fig. 88) and the asymptotes of a hyperbola for
n = 2.
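
For n = 2 the statements of this section can be checked on a concrete hyperbola. The sketch below is an illustration only; the hyperbola $x_1^2/4 - x_2^2 = 1$, the function names, and the test directions are hypothetical choices. It tests condition (8) for several directions and evaluates the coefficients of equation (3) of this section for a line through the centre with an asymptotic direction.

```python
from fractions import Fraction as Fr

def quadratic_part(a, l):
    """Left member of condition (8): sum of a[i][k] * l[i] * l[k]."""
    n = len(l)
    return sum(a[i][k] * l[i] * l[k] for i in range(n) for k in range(n))

def is_asymptotic(a, l):
    """True when the direction l satisfies condition (8)."""
    return quadratic_part(a, l) == 0

def line_coefficients(a, b, c, x0, l):
    """Coefficients (quad, lin, const) of quad*tau^2 + 2*lin*tau + const = 0."""
    n = len(l)
    idx = [(i, k) for i in range(n) for k in range(n)]
    lin = sum(a[i][k] * x0[i] * l[k] for i, k in idx) + sum(b[k] * l[k] for k in range(n))
    const = (sum(a[i][k] * x0[i] * x0[k] for i, k in idx)
             + 2 * sum(b[k] * x0[k] for k in range(n)) + c)
    return quadratic_part(a, l), lin, const

# Hypothetical hyperbola x1^2/4 - x2^2 - 1 = 0; its centre is the origin.
a, b, c = [[Fr(1, 4), 0], [0, -1]], [0, 0], -1

print(is_asymptotic(a, [2, 1]), is_asymptotic(a, [2, -1]), is_asymptotic(a, [1, 0]))
print(line_coefficients(a, b, c, [0, 0], [2, 1]))
```

For the line through the centre in the direction {2, 1} the first two coefficients vanish and the constant term does not, which is the contradictory case: the asymptote has no finite point in common with the hyperbola.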

§  10.  Conjugate  directions 

1. Suppose a vector l has a nonasymptotic direction. Then any
straight line passing in the direction of l intersects the
hypersurface in two points $M_1$ and $M_2$. Let $M_0$ be the midpoint of the chord
$M_1M_2$. For the definition of the midpoint of a line segment in the
case of a real affine space see Subsection 3, Section 8, Chapter III.
If $M_1$, $M_2$ are complex conjugate points, then the midpoint $M_0$ of
the chord $M_1M_2$ is to be understood as the point whose coordinates
are the arithmetic means of the coordinates of the endpoints. $M_0$ is
a real point in this case as well.

Consider all straight lines parallel to the vector l and on each
of them find the midpoint of the chord $M_1M_2$. It turns out that the
locus of these midpoints is a hyperplane.

We now prove this. We have

$$\overrightarrow{M_0M_1} = \tau_1 l, \quad \overrightarrow{M_0M_2} = \tau_2 l \qquad (1)$$

where $\tau_2 = -\tau_1$. If $M_1$ and $M_2$ are real points, then the equality
$\tau_2 = -\tau_1$ is evident geometrically. If $M_1$, $M_2$ are complex
conjugate points, then in place of each of the vector equations (1) we
can write n coordinate equations. From them too it will follow
immediately that $\tau_2 = -\tau_1$.

Thus

$$\tau_1 + \tau_2 = 0 \qquad (2)$$

Let us go back to equation (3) of Section 9. Since the straight
line has a nonasymptotic direction, the coefficient of the square of
the unknown is nonzero. By Vieta's theorem (the sum and
product formulas of Vieta) and because of (2), the coefficient of the
first power of the unknown in equation (3), Section 9, must vanish.
Therefore

$$\sum a_{ik}x_i^0l_k + \sum b_kl_k = 0 \qquad (3)$$

To obtain an equation for all midpoints, we have to assume
that $M_0$ is any midpoint and then consider its coordinates as the
running coordinates $(x_1, \ldots, x_n)$. Then from (3) we have

$$\sum a_{ik}x_il_k + \sum b_kl_k = 0 \qquad (4)$$

Put

$$N_i = \sum_k a_{ik}l_k, \quad D = \sum_k b_kl_k$$

Then (4) becomes

$$N_1x_1 + \ldots + N_nx_n + D = 0 \qquad (5)$$

It is easy to prove that there are nonzero numbers among the $N_i$.
Indeed, suppose

$$N_i = \sum_k a_{ik}l_k = 0 \qquad (6)$$

for all i = 1, ..., n. Multiplying equations (6) by $l_i$ and adding,
we obtain

$$\sum a_{ik}l_il_k = 0$$

contrary to the hypothesis made at the beginning of the section.

Since there are nonzero numbers among the numbers $N_1, \ldots, N_n$,
(5) is the equation of a hyperplane.

2. Hyperplane (5) is called the diametral hyperplane conjugate
to the direction l with respect to the given hypersurface.

It bisects every chord parallel to l.
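
The definition can be tried out on a plane example. The sketch below is an assumption-laden illustration: the ellipse $x_1^2/4 + x_2^2 = 1$, the direction l = {1, 1}, the chosen point, and the function names are all hypothetical. It computes $N_i = \sum_k a_{ik}l_k$ and $D = \sum_k b_kl_k$ and confirms that for a point of hyperplane (5) the coefficient of the first power of $\tau$ in equation (3) of Section 9 vanishes, so that $\tau_1 + \tau_2 = 0$ and the point is the midpoint of its chord.

```python
from fractions import Fraction as Fr

def diametral_hyperplane(a, b, l):
    """Coefficients (N, D) of N_1 x_1 + ... + N_n x_n + D = 0 for direction l."""
    n = len(l)
    N = [sum(a[i][k] * l[k] for k in range(n)) for i in range(n)]
    D = sum(b[k] * l[k] for k in range(n))
    return N, D

def linear_coefficient(a, b, x0, l):
    """Half the coefficient of tau in equation (3) of Section 9."""
    n = len(l)
    return (sum(a[i][k] * x0[i] * l[k] for i in range(n) for k in range(n))
            + sum(b[k] * l[k] for k in range(n)))

# Hypothetical ellipse x1^2/4 + x2^2 - 1 = 0 and chord direction l = {1, 1}.
a, b = [[Fr(1, 4), 0], [0, 1]], [0, 0]
l = [1, 1]

N, D = diametral_hyperplane(a, b, l)          # the line x1/4 + x2 = 0
x0 = [Fr(4, 5), Fr(-1, 5)]                    # a point of that line inside the ellipse
on_hyperplane = sum(Ni * xi for Ni, xi in zip(N, x0)) + D == 0
print(N, D, on_hyperplane, linear_coefficient(a, b, x0, l))
```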

3. The numbers $N_1, \ldots, N_n$ form the coordinates of a vector
$N = \{N_1, \ldots, N_n\}$.


We  assume  the  coordinates  in  space  to  be  orthonormal.  Then 
vector  N  is  orthogonal  to  the  diametral  hyperplane  (5),  that  is  to 
say,  it  is  its  normal  vector. 

The relations

$$N_i = \sum_k a_{ik}l_k \quad (i = 1, \ldots, n)$$

may be regarded as the linear transformation

$$N = Al$$

which carries vector l into vector N (Fig. 89).

This  is  precisely  the  self-adjoint  linear  transformation  that 
engaged  us  when  we  investigated  the  general  equation  of  a  quadric 
hypersurface. 

4. We know that in the process of reducing the equation of a
hypersurface to canonical form it is necessary to direct the
coordinate axes along the eigenvectors of the transformation A.
It is now easy to see why, for geometric reasons, it is precisely
these directions that are advantageous for simplifying the
equations of a hypersurface.

Suppose for the sake of simplicity we are considering a proper
direction that is not asymptotic. Then the conjugate diametral
hyperplane exists and is perpendicular to this direction. It is
therefore the plane of symmetry of the given hypersurface.

From this, at least in the case of a nondegenerate central
hypersurface, it is clear that by reducing it to canonical form we take
for the coordinate planes the orthogonal system of its planes of
symmetry.


Chapter  XII 


PROJECTIVE  SPACE 


§  1.  Homogeneous  coordinates  in  affine  space.  Points  at  infinity 


1. We consider the n-dimensional affine space $\mathfrak{A}_n$. In it is given
a system of affine coordinates with origin O and basis $e_1, \ldots, e_n$.
Let $x_1, \ldots, x_n$ be the coordinates of an arbitrary point M.

We now introduce into $\mathfrak{A}_n$ the so-called homogeneous
coordinates.

We say that $\xi_1, \ldots, \xi_{n+1}$ ($\xi_{n+1} \neq 0$) are the homogeneous
coordinates of a point $M(x_1, \ldots, x_n)$ if

$$\frac{\xi_1}{\xi_{n+1}} = x_1, \quad \ldots, \quad \frac{\xi_n}{\xi_{n+1}} = x_n \qquad (1)$$


It is clear that M is defined by the numbers $(\xi_1, \ldots, \xi_n, \xi_{n+1})$.
We will write $M(\xi_1, \ldots, \xi_{n+1})$. It is also clear, in turn, that M
does not fully determine its homogeneous coordinates. Indeed, if
we multiply all the homogeneous coordinates by one and the same
nonzero number, the point will not change. In other words, the
set of numbers $(\xi_1, \ldots, \xi_{n+1})$ and the set of numbers
$(\lambda\xi_1, \ldots, \lambda\xi_{n+1})$, for $\lambda \neq 0$, determine one and the same point.
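
Relations (1) translate directly into code. The two functions below are a minimal sketch with hypothetical names; the optional factor `scale` plays the role of the arbitrary nonzero number $\lambda$.

```python
from fractions import Fraction as Fr

def to_homogeneous(x, scale=1):
    """Homogeneous coordinates (scale*x_1, ..., scale*x_n, scale) of the point x."""
    if scale == 0:
        raise ValueError("the common factor must be nonzero")
    return [Fr(xi) * scale for xi in x] + [Fr(scale)]

def to_affine(xi):
    """Affine coordinates x_k = xi_k / xi_{n+1}; requires xi_{n+1} != 0."""
    last = xi[-1]
    if last == 0:
        raise ValueError("xi_{n+1} = 0: no point of the affine space")
    return [Fr(c) / last for c in xi[:-1]]

# Proportional sets of homogeneous coordinates give one and the same point.
print(to_affine([4, -6, 2]) == to_affine([-2, 3, -1]) == [2, -3])
print(to_affine(to_homogeneous([2, -3], scale=7)) == [2, -3])
```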


2. By what has been said, homogeneous coordinates of an
arbitrarily taken point depend on the choice of the affine system of
coordinates, that is, on the choice of the origin O and the basis
$e_1, \ldots, e_n$. Let us change to a new origin and a new basis. Then
the affine (nonhomogeneous) coordinates will change via formulas
of the form

$$\begin{aligned} x'_1 &= Q_{11}x_1 + \ldots + Q_{1n}x_n + Q_{1\,n+1}, \\ &\;\;\vdots \\ x'_n &= Q_{n1}x_1 + \ldots + Q_{nn}x_n + Q_{n\,n+1} \end{aligned} \qquad (2)$$

where the constant terms $Q_{1\,n+1}, \ldots, Q_{n\,n+1}$ are the new
coordinates of the old origin O; the matrix $\|Q_{ij}\|$ (i, j = 1, ..., n) is
determined in familiar fashion from the old and new bases (see
Section 5, Chapter II). As we see, formulas (2) are generally not
homogeneous.




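From (1) and (2) we get appropriate formulas for the
transformation of homogeneous coordinates:

$$\begin{aligned} \xi'_1 &= Q_{11}\xi_1 + \ldots + Q_{1n}\xi_n + Q_{1\,n+1}\xi_{n+1}, \\ &\;\;\vdots \\ \xi'_n &= Q_{n1}\xi_1 + \ldots + Q_{nn}\xi_n + Q_{n\,n+1}\xi_{n+1}, \\ \xi'_{n+1} &= \xi_{n+1} \end{aligned} \qquad (3)$$

where the last equation $\xi'_{n+1} = \xi_{n+1}$ is taken at pleasure. We could
have written

$$\begin{aligned} \lambda\xi'_1 &= Q_{11}\xi_1 + \ldots + Q_{1n}\xi_n + Q_{1\,n+1}\xi_{n+1}, \\ &\;\;\vdots \\ \lambda\xi'_n &= Q_{n1}\xi_1 + \ldots + Q_{nn}\xi_n + Q_{n\,n+1}\xi_{n+1}, \\ \lambda\xi'_{n+1} &= \xi_{n+1} \end{aligned} \qquad (4)$$

where $\lambda$ is any nonzero scalar.

We see that the formulas for transformation of homogeneous
coordinates are themselves homogeneous, that is to say, they do
not have constant terms.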


3. Let us consider an arbitrary hyperplane in $\mathfrak{A}_n$:

$$A_1x_1 + \ldots + A_nx_n + A_{n+1} = 0$$

Passing to homogeneous coordinates, we get the equation

$$A_1\xi_1 + \ldots + A_n\xi_n + A_{n+1}\xi_{n+1} = 0 \qquad (5)$$

Thus, in homogeneous coordinates a hyperplane is determined
by a first-degree homogeneous equation. Accordingly, a plane of
any dimension k is determined by a system of homogeneous linear
equations of rank r = n − k.


4. We consider in $\mathfrak{A}_n$ an arbitrary quadric hypersurface:

$$\sum_{i,j=1}^{n} A_{ij}x_ix_j + 2\sum_{i=1}^{n} A_{i\,n+1}x_i + A_{n+1\,n+1} = 0$$

Passing to homogeneous coordinates, we get

$$\sum_{i,j=1}^{n} A_{ij}\xi_i\xi_j + 2\sum_{i=1}^{n} A_{i\,n+1}\xi_i\xi_{n+1} + A_{n+1\,n+1}\xi_{n+1}^2 = 0 \qquad (6)$$

or

$$\sum_{\alpha,\beta=1}^{n+1} A_{\alpha\beta}\xi_\alpha\xi_\beta = 0$$

We again have a homogeneous equation. We can already see the
convenience of using homogeneous coordinates since the equation
of the quadric hypersurface is written in more compact form
because its left member is a quadratic form. Actually, we have
already employed homogeneous coordinates in Section 2 of
Chapter XI and found them to be useful.

5. To the affine space $\mathfrak{A}_n$ let us adjoin new elements which we
will call points at infinity (ideal points). We do not offer any
pictorial descriptions and will merely regard a point at infinity as
an entity determined by homogeneous coordinates $(\xi_1, \ldots, \xi_n, \xi_{n+1})$,
provided that

$$\xi_{n+1} = 0 \qquad (7)$$

and that among the numbers $(\xi_1, \ldots, \xi_n)$ at least one is different
from zero. Also, we assume that the sets $(\lambda\xi_1, \ldots, \lambda\xi_n, 0)$ for all
$\lambda \neq 0$ specify one and the same ideal point, while nonproportional
sets define distinct points.

We will say that a certain ideal point belongs to a given
hyperplane or a given quadric hypersurface, and so on, if its coordinates
satisfy the equation of the hyperplane or the equation of the
hypersurface, etc. Under a change of the original affine coordinate
system, we assume that the coordinates of an arbitrarily chosen
ideal point change via the formulas (3) or (4), where the last line
is the identity 0 = 0.

6. We will assume that every homogeneous linear equation in
the coordinates $\xi_1, \ldots, \xi_{n+1}$, that is, every equation of type (5),
determines some hyperplane. Equation (7) is also such an equation.
Accordingly, all ideal points are viewed as constituting a
hyperplane, which is called the ideal hyperplane.

7. In $\mathfrak{A}_n$ let us take two parallel hyperplanes

$$A_1\xi_1 + \ldots + A_n\xi_n + A'_{n+1}\xi_{n+1} = 0$$

and

$$A_1\xi_1 + \ldots + A_n\xi_n + A''_{n+1}\xi_{n+1} = 0$$

The intersection of each of them with the ideal hyperplane is
given by one and the same system of equations

$$A_1\xi_1 + \ldots + A_n\xi_n = 0, \quad \xi_{n+1} = 0 \qquad (8)$$

whence it follows that parallel hyperplanes have common ideal
points. Since the system (8) has rank r = 2, the set of all ideal
points common to both parallel hyperplanes is to be considered an
(ideal) plane of dimension n − 2. (See Fig. 90; here and henceforth
we will depict elements at infinity (ideal elements) and the
intersections of geometric figures at ideal points in a conventional
fashion.)

8. Assuming that every homogeneous system of linear equations
of rank r = n − k specifies a plane of dimension k in homogeneous
coordinates, we can establish, as was done in the preceding
subsection, that two parallel planes of the same dimension k intersect
at infinity in an ideal plane of dimension k − 1. In particular, any
two parallel straight lines intersect in a single ideal point
(Fig. 91).

Fig. 90    Fig. 91
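
For n = 2 the common ideal point of two parallel lines can be computed explicitly. The sketch below uses a device that is not in the text and is labelled here as an assumption: in the plane, a solution of two homogeneous linear equations in three unknowns is given by the cross product of their coefficient rows. The two lines are hypothetical examples.

```python
def cross(p, q):
    """Cross product of two coefficient triples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def incident(line, point):
    """True when the homogeneous coordinates of the point satisfy the line's equation."""
    return sum(a * x for a, x in zip(line, point)) == 0

# Hypothetical parallel lines x1 - x2 + 1 = 0 and x1 - x2 - 3 = 0,
# written as coefficient rows (A_1, A_2, A_3) of equation (5).
line1 = (1, -1, 1)
line2 = (1, -1, -3)

common = cross(line1, line2)
print(common, incident(line1, common), incident(line2, common))
```

The third coordinate of the common point is zero, so it is an ideal point; its first two coordinates are proportional to the common direction {1, 1} of the two lines.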

9. The affine space $\mathfrak{A}_n$ thus augmented by ideal elements is
termed an n-dimensional projective space. However, it would be
more exact to say that this is one of a number of concrete models
of n-dimensional projective space, a general description of which
is given in the next section.

§ 2. The concept of a projective space

1. Let us consider a set of objects whose nature and exterior
aspect are immaterial. All we assume is that each of these entities
is uniquely specified by an ordered set of numbers $(\xi_1, \ldots, \xi_{n+1})$.
These entities will be called points and each will be denoted in
the usual manner, for instance, $M(\xi_1, \ldots, \xi_{n+1})$.

This set will be called an n-dimensional projective space,
denoted $P_n$, if the two following conditions hold.

(A) Any ordered set of numbers $(\xi_1, \ldots, \xi_{n+1})$ determines a
point $M(\xi_1, \ldots, \xi_{n+1})$ if at least one of the numbers $\xi_1, \ldots, \xi_{n+1}$
is nonzero. An (n + 1)-tuple consisting solely of zeros does not
determine any point.

(B) If $\lambda$ is a scalar not equal to zero ($\lambda \neq 0$), then two
proportional sets $(\xi_1, \ldots, \xi_{n+1})$ and $(\lambda\xi_1, \ldots, \lambda\xi_{n+1})$ determine one and
the same point in $P_n$. Nonproportional sets determine distinct
points in $P_n$.




The numbers $\xi_1, \ldots, \xi_{n+1}$ are called the homogeneous coordinates
of the point M in $P_n$.

Important remark. The foregoing is viewed as a definition of an
n-dimensional projective space. However, ordinarily the definition
of a projective space includes the description of certain special
subsets, called k-dimensional planes (k = 0, 1, ..., n), the system
of these subsets having definite properties. These properties are
expressed by the axioms of the projective space. In one and the
same set of points, subsets of this kind (planes) may be designated
in different ways or, as it is common to say, it is possible to
specify distinct projective structures on one and the same set. We
allow ourselves to restrict the definition of a projective space to
only two axioms, (A) and (B), because later we will introduce
systems of subsets called planes in accordance with a very definite
standard (via systems of linear equations), and we will always
adhere strictly to this standard.

Remark. We do not give the axioms of a projective space, for
the axiomatic definition of a projective space is more involved
than the familiar axiomatic definition of a linear space that was
given in Section 1, Chapter I.

2. We stress from the very start that a projective space is not
a vector space for the reason that no linear operations will be
defined in it.

3. A projective space is said to be real if for its points
$M(\xi_1, \ldots, \xi_{n+1})$ only real values of the coordinates $(\xi_1, \ldots, \xi_{n+1})$
are admitted. If also complex numbers are taken for the $\xi_k$, then
the projective space is said to be complex.

4. Given the relations

$$\begin{aligned} \lambda\xi'_1 &= Q_{11}\xi_1 + \ldots + Q_{1\,n+1}\xi_{n+1}, \\ &\;\;\vdots \\ \lambda\xi'_{n+1} &= Q_{n+1\,1}\xi_1 + \ldots + Q_{n+1\,n+1}\xi_{n+1} \end{aligned} \qquad (1)$$

provided that the (n + 1) × (n + 1) matrix $Q = \|Q_{ij}\|$ is
nonsingular and $\lambda \neq 0$.

Then, starting with an arbitrarily specified set of numbers
$(\xi_1, \ldots, \xi_{n+1})$, we can determine a new set of numbers
$(\lambda\xi'_1, \ldots, \lambda\xi'_{n+1})$. And if among the numbers $\xi_k$ there is at least one
nonzero number, then there will also be a nonzero number among the
$\xi'_k$. This is due to the nonsingularity of the matrix Q.

We will assume that the numbers $(\xi'_1, \ldots, \xi'_{n+1})$ are the new
coordinates of a point which had earlier been defined by the
coordinates $(\xi_1, \ldots, \xi_{n+1})$. Also, as is readily seen, requirements (A)
and (B) hold true for the new sets of numbers $\xi'$. For any
specification of the matrix Q we will regard formulas (1) as formulas
for the transformation of the coordinates. For the present we will
not look into the geometric meaning of these transformations (it
will be examined in the next section).

Thus, alongside the original coordinates $\xi_1, \ldots, \xi_{n+1}$ we
introduce many other coordinate systems, to which we give the name
projective systems of coordinates in $P_n$. Each new system is defined
by a specification of the matrix Q.

The original coordinates $\xi_1, \ldots, \xi_{n+1}$ do not have any
preferential advantage over the other coordinate systems introduced by the
formulas (1). Indeed, (1) may be inverted and then the old
coordinates will be expressed in terms of the new ones via formulas of
the same type. Besides, if using formulas like (1) with matrix Q we pass
from the coordinates $\xi_1, \ldots, \xi_{n+1}$ to the coordinates $\xi'_1, \ldots, \xi'_{n+1}$
and then, via similar formulas with matrix $\tilde{Q}$, we pass from the
coordinates $\xi'_1, \ldots, \xi'_{n+1}$ to the coordinates $\xi''_1, \ldots, \xi''_{n+1}$, then
$\xi''_1, \ldots, \xi''_{n+1}$ will be expressed in terms of $\xi_1, \ldots, \xi_{n+1}$ via formulas
of type (1) with matrix $\tilde{Q}Q$. In short, the equivalence of all the
indicated coordinate systems is a consequence of the fact that
transformations of type (1) constitute a group (the familiar group
of linear transformations of variables).

5. Formulas of type (1) may be viewed from another
standpoint. We may say that the system of coordinates does not change
but that the point $M(\xi_1, \ldots, \xi_{n+1})$ itself is transformed into the
point $M'(\xi'_1, \ldots, \xi'_{n+1})$. Regarded from this standpoint,
formulas (1) specify a certain one-to-one transformation of projective
space. Any transformation of this type in the space $P_n$ is termed a
projective transformation. All projective transformations of the
space $P_n$ (that is, such that correspond to all possible nonsingular
matrices Q) constitute a group. It is called the projective group
of the space $P_n$.

6. It would be a mistake to think that the projective group of $P_n$
is isomorphic to the group of nonsingular (n + 1) × (n + 1)
matrices. The point is that formulas of type (1) with matrices Q and
$\alpha Q$ (where $\alpha$ is any nonzero scalar) determine one and the same
projective transformation. In particular, when $Q = \alpha E$, $\alpha \neq 0$, we
obtain, independently of $\alpha$, an identity projective transformation
which leaves all points fixed.
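
This observation is easy to test numerically. In the sketch below the matrix Q, the scalar, and the sample point are hypothetical, and two sets of homogeneous coordinates are compared by checking that all their 2 × 2 minors vanish, which is one way (assumed here, not taken from the text) of testing proportionality of nonzero sets.

```python
def apply_projective(Q, xi):
    """Right members of formulas of type (1): the product Q * xi."""
    return [sum(q * x for q, x in zip(row, xi)) for row in Q]

def scale_matrix(alpha, Q):
    """The matrix alpha * Q."""
    return [[alpha * q for q in row] for row in Q]

def same_point(p, q):
    """True when two nonzero coordinate sets are proportional (condition (B))."""
    m = len(p)
    return all(p[i] * q[j] == p[j] * q[i] for i in range(m) for j in range(m))

# Hypothetical nonsingular matrix (its determinant is 3) and sample point in P_2.
Q = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
xi = [1, 2, 3]
alpha = -5

print(same_point(apply_projective(Q, xi), apply_projective(scale_matrix(alpha, Q), xi)))
print(same_point(xi, apply_projective(scale_matrix(alpha, E), xi)))
```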

7. The subject of projective geometry, that is, the theory of
projective spaces, involves objects, properties, and quantities that are
invariant with respect to the projective group. Let us consider
some invariants of the projective group.

8. We use the term hyperplane in projective space $P_n$ for any
set of points that is defined, in a given coordinate system, by a
homogeneous equation of the first degree:

$$A_1\xi_1 + \ldots + A_{n+1}\xi_{n+1} = 0 \qquad (2)$$

We transform equation (2) in accord with the projective
transformation (1) (see Subsection 5). Inverting (1) we get

$$\xi_i = \lambda\sum_k P_{ik}\xi'_k \qquad (3)$$

where $P = \|P_{ik}\|$ is the matrix inverse to Q. Substituting this
expression into (2), we obtain

$$A'_1\xi'_1 + \ldots + A'_{n+1}\xi'_{n+1} = 0 \qquad (4)$$

where $(\xi'_1, \ldots, \xi'_{n+1})$ denote the running coordinates of a point
in the same coordinate system in which the hyperplane (2) is
specified, and the coefficients $A'_k$ are expressed by the formula

$$A'_k = \lambda\sum_i P_{ik}A_i \qquad (5)$$

Since there are nonzero numbers among the $A_i$, $\lambda \neq 0$, and the
matrix P is nonsingular, it follows that there are nonzero numbers
among the $A'_k$. Therefore (4) is not an identity. Consequently,
(4) is a first-degree equation. Conversely, if we substitute (1)
into (4), then, taking into account (5), we get (2). We see that
points that satisfy (2) are carried into points that satisfy (4), and
conversely. For this reason, the images of the points of
hyperplane (2) fill the entire hyperplane (4).

Conclusion.  Under  a  projective  transformation,  every  hyperplane 
is  carried  into  a  hyperplane. 
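The conclusion can be checked on a small numerical case. The Python
sketch below assumes that the transformation acts on coordinate columns
as $\xi' = Q\xi$ (proportionality factor set to 1) and that $P$ is the
inverse of $Q$, so that the coefficients of the image hyperplane are
$A'_k = \sum_i A_i P_{ik}$. The matrices and helper names are
illustrative assumptions.

```python
def apply(Q, xi):
    """Image of the coordinate column xi under the matrix Q (list of rows)."""
    return [sum(q * x for q, x in zip(row, xi)) for row in Q]

def transformed_coefficients(A, P):
    """Coefficients A'_k = sum_i A_i * P[i][k] of the image hyperplane."""
    size = len(A)
    return [sum(A[i] * P[i][k] for i in range(size)) for k in range(size)]

def on_hyperplane(A, xi):
    """True if xi satisfies the homogeneous equation A_1 xi_1 + ... = 0."""
    return sum(a * x for a, x in zip(A, xi)) == 0

P = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
Q = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]      # Q is the inverse of P

A = [1, 2, -1]                               # hyperplane (a line in the case n = 2)
A_image = transformed_coefficients(A, P)

points = [[1, 0, 1], [0, 1, 2]]              # two points lying on the hyperplane A
images = [apply(Q, xi) for xi in points]
```

Every point of `points` satisfies the equation with coefficients `A`, and
every image satisfies the equation with coefficients `A_image`.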

Thus, the set of all hyperplanes in $P_n$ is an entity that is
invariant with respect to the projective group. Therefore hyperplanes
are part of the subject matter of projective geometry.

Remark. The transition from equation (2) to (4) may be viewed
otherwise, namely as a change to the equation of the same
hyperplane in another system of projective coordinates. From this it is
evident that the equation of a hyperplane is linear and
homogeneous in all projective coordinate systems. It is easy to see that,
generally, the projective invariance of a class of entities is
equivalent to the invariance of the class of equations of these entities
with respect to a transition from one projective coordinate system
to another.




9. We use the term $k$-dimensional (projective) plane in $P_n$ for
any set of points that is determined in a given coordinate system
by some homogeneous linear system of equations of rank
$r = n - k$:

$$\begin{aligned}
a_{11}\xi_1 + \dots + a_{1,n+1}\xi_{n+1} &= 0 \\
\dots\dots\dots\dots\dots\dots\dots & \\
a_{r1}\xi_1 + \dots + a_{r,n+1}\xi_{n+1} &= 0
\end{aligned} \tag{6}$$

Denote the rectangular matrix of system (6) by $A$.
Transforming (6) via the formulas (3), we get

$$\begin{aligned}
a'_{11}\xi'_1 + \dots + a'_{1,n+1}\xi'_{n+1} &= 0 \\
\dots\dots\dots\dots\dots\dots\dots & \\
a'_{r1}\xi'_1 + \dots + a'_{r,n+1}\xi'_{n+1} &= 0
\end{aligned} \tag{7}$$

Let $A'$ be the matrix of system (7). As in Subsection 9, Section 5,
Chapter III, we have, from (3) and (6),

$$A' = \lambda A P \tag{8}$$

From (8) it follows that rank $A'$ = rank $A$ and so (7) specifies a
plane of the same dimension $k$.

From these manipulations it is clear that under a projective
transformation a $k$-dimensional plane goes into a $k$-dimensional
plane. Thus, the set of all $k$-dimensional planes in $P_n$ is an entity
which is invariant with respect to the projective group. Therefore,
$k$-dimensional planes come within the subject matter of projective
geometry.


10. Two planes in projective space are said to be skew if they
do not have any points in common.

In $P_n$ let us consider planes $P_k$ and $P_l$, of dimensions $k$ and
$l$, specified by systems of homogeneous linear equations, and let us
combine all their equations into a single system. If the combined
system has only the trivial solution, then $P_k$ and $P_l$ are skew,
since the set $(0, \dots, 0)$ does not determine any point; otherwise
the planes intersect. The combined system has rank at most
$(n - k) + (n - l)$, and it has only the trivial solution precisely
when its rank equals the number of unknowns, $n + 1$. From this it is
easy to compute that two planes in $P_n$ can be skew only if the sum of
their dimensions is less than the dimension of the space:

$$k + l < n$$

From Subsection 9 and from the fact that projective transformations
are one-to-one it follows that under projective transformations
intersecting planes go into intersecting planes and skew planes go
into skew planes.
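The skewness test described above reduces to a rank computation. The
following Python sketch is one way to carry it out over the rationals;
the functions `rank` and `are_skew` and the sample systems are
illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def are_skew(system_1, system_2):
    """Two planes are skew iff the combined homogeneous system has only the
    trivial solution, i.e. its rank equals the number of unknowns n + 1."""
    unknowns = len(system_1[0])
    return rank(system_1 + system_2) == unknowns

# Lines in P_3 (k = l = 1, so k + l < n = 3 and skew lines are possible).
line_a = [[1, 0, 0, 0], [0, 1, 0, 0]]        # xi_1 = 0, xi_2 = 0
line_b = [[0, 0, 1, 0], [0, 0, 0, 1]]        # xi_3 = 0, xi_4 = 0
line_c = [[1, 0, 0, 0], [0, 0, 1, 0]]        # xi_1 = 0, xi_3 = 0

# Lines in P_2 (k + l = 2 = n): any two of them must intersect.
plane_line_1 = [[1, 0, 0]]
plane_line_2 = [[0, 1, 0]]
```

In this example `line_a` and `line_b` are skew, `line_a` and `line_c`
meet in the point $(0, 0, 0, 1)$, and the two lines of $P_2$ meet in
$(0, 0, 1)$.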

Remark. When an affine space is supplemented with ideal points,
it may happen that planes which are skew in $\mathfrak{A}_n$ become
intersecting planes of the supplemented space (concerning skew
planes in $\mathfrak{A}_n$, see Section 7 of Chapter III).

11. We define a quadric hypersurface in the projective space $P_n$
as any set of points determined in a projective coordinate system
by a second-degree homogeneous equation:

$$\sum_{i,j=1}^{n+1} a_{ij}\,\xi_i\xi_j = 0$$

As in the preceding case, we can prove that the set of all quadric
hypersurfaces in $P_n$ is an entity that is invariant with respect to
the projective group. For this reason, quadric hypersurfaces belong
to the subject of projective geometry.

12. Projective geometry treats of the properties of hyperplanes,
$k$-dimensional planes, quadric hypersurfaces, etc., that are
invariant under any projective transformations. The dimension of a
plane is one such property.

13.  A  one-dimensional  projective  plane  is  called  a  projective 
line. 

Let us consider an arbitrary straight line in $P_n$. It is determined
by a homogeneous linear system of equations of rank $n - 1$ in
$n + 1$ variables. Hence, in the given case the fundamental set of
solutions consists of two independent solutions. We denote them
by $(u_1, \dots, u_{n+1})$ and $(v_1, \dots, v_{n+1})$. They are
associated with two points $U$, $V$ on the line. Let
$M(\xi_1, \dots, \xi_{n+1})$ be an arbitrary point of this line. Since
every solution is linearly expressible in terms of the fundamental set
of solutions, we have

$$\xi_i = \mu u_i + \nu v_i, \quad i = 1, \dots, n+1 \tag{9}$$

where $\mu$, $\nu$ are certain numbers not simultaneously zero.

Formula (9) expresses the geometric fact that a straight line is
uniquely determined by any two of its points and that a straight
line can be drawn through any two points $U, V \in P_n$.

The numbers $\mu$, $\nu$ may be viewed as the homogeneous
coordinates of point $M$ on the given line. At the same time we conclude
that a straight line in projective space is itself a one-dimensional
projective space.

Confining ourselves to the case of a real space $P_n$, we put

$$c = \frac{1}{\sqrt{\mu^2 + \nu^2}}, \quad \mu c = \cos\alpha, \quad \nu c = \sin\alpha$$

Then from (9) we have

$$c\,\xi_i = u_i \cos\alpha + v_i \sin\alpha \tag{10}$$

where $c\,\xi_i$ are the coordinates of the same point $M$. Varying
$\alpha$ from $-\infty$ to $+\infty$, we get all possible points on the
given straight line. Here, due to the periodicity of sine and cosine,
every point $M$ will be repeated an infinitude of times. Formula (10)
shows that a simple closed curve, say, an ordinary circle, can serve as
a pictorial model for a real projective line. Actually, when $\alpha$
varies from 0 to $\pi$, $M$ runs through the entire projective line once
and returns to its original position. Observe that when an affine space
is supplemented with ideal points, every line is supplemented with
a single point, which is what makes it a closed curve (Fig. 92).

Fig. 92 (a line closed up by its single ideal point $\infty$)
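The parametrization by the angle $\alpha$ can be tried out numerically.
The Python sketch below uses floating-point arithmetic, so equality of
points is tested up to a tolerance; the vectors `u`, `v`, the angle and
the helper names are illustrative assumptions.

```python
import math

def point_on_line(u, v, alpha):
    """Homogeneous coordinates u*cos(alpha) + v*sin(alpha) of a point of the line UV."""
    return [ui * math.cos(alpha) + vi * math.sin(alpha) for ui, vi in zip(u, v)]

def same_point(p, q, tol=1e-9):
    """True if p and q are proportional up to rounding (all 2x2 minors vanish)."""
    n = len(p)
    return all(abs(p[i] * q[j] - p[j] * q[i]) < tol
               for i in range(n) for j in range(i + 1, n))

u, v = [1.0, 0.0, 2.0], [0.0, 1.0, -1.0]     # two points of a line in a real P_2
alpha = 0.7

p = point_on_line(u, v, alpha)
q = point_on_line(u, v, alpha + math.pi)      # all coordinates change sign
r = point_on_line(u, v, alpha + math.pi / 2)  # a different point of the line
```

Increasing $\alpha$ by $\pi$ multiplies all coordinates by $-1$, so `p`
and `q` are the same projective point, while `r` is not.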


A similar representation of a real two-dimensional projective
plane in the form of a sphere is erroneous. In this connection, see
Subsection 11 of the next section.

14. In the case of a $k$-dimensional projective plane, in place of
(9) we have

$$\xi_i = \mu_1 u_i^{(1)} + \mu_2 u_i^{(2)} + \dots + \mu_{k+1} u_i^{(k+1)} \tag{11}$$

where $\{u_i^{(1)}\}, \{u_i^{(2)}\}, \dots, \{u_i^{(k+1)}\}$ constitute
the fundamental set of solutions of a linear system of equations of
type (6), and $\mu_1, \dots, \mu_{k+1}$ are numbers, not all of which
are zero. They may be viewed as the homogeneous coordinates of a point
in a $k$-dimensional plane.

From this it is clear that a $k$-dimensional plane in $P_n$ is itself
a $k$-dimensional projective space (real or complex according as
$P_n$ is real or complex).

15. Now let us return to the arbitrary line (9) in projective
space $P_n$. Together with points $U$, $V$ we consider another two
points on this same straight line: $M$ with coordinates

$$\xi_i = \mu u_i + \nu v_i \tag{12}$$

and $N$ with coordinates

$$\eta_i = \bar\mu u_i + \bar\nu v_i$$

We assume they differ from $U$ and $V$. The number

$$\frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu} \tag{13}$$

is called the cross ratio (or anharmonic ratio) in which the
ordered pair of points $M$, $N$ divides the ordered pair of points
$U$, $V$.

To denote this number we use the symbol $(UVMN)$. Thus

$$(UVMN) = \frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu} \tag{14}$$

Observe that each of the simple ratios $\nu/\mu$ and
$\bar\nu/\bar\mu$ is not determined by the points $U$, $V$, $M$ and
$U$, $V$, $N$. This is clear because the homogeneous coordinates of any
one of these points may be multiplied by any nonzero number.

However, the cross ratio has a very definite numerical value,
which depends solely on the specification, on the line, of the
ordered pairs of points $U$, $V$ and $M$, $N$.

Indeed, if the coordinates of the points $U$, $V$, $M$, $N$ are
multiplied respectively by four arbitrary nonzero factors, then the
fractions $\nu/\mu$ and $\bar\nu/\bar\mu$ will be multiplied by one and
the same factor, which is cancelled out in (14).

Fig. 93 (points $U$, $V$, $M$, $N$ on a line, with affine coordinates
$x_1$, $x_2$, $x_3$, $x_4$)

What is more, a cross ratio remains unaltered when passing to
any new system of projective coordinates of the space. This is
equivalent to the fact that a cross ratio is invariant under any
projective transformations. The proof of this is given in Subsection 16.

Let us consider in more detail the geometric meaning of the
cross ratio. Take four distinct points $U$, $V$, $M$, $N$ on an ordinary
straight line. Assume that on this line an affine coordinate $x$ has
been introduced that takes on the values $x_1$, $x_2$, $x_3$, $x_4$ at
the respective points (Fig. 93). Passing to homogeneous coordinates
$(\xi, \eta)$ via the formula $x = \xi/\eta$, $\eta \neq 0$, and setting
$\eta = 1$, we have the following homogeneous coordinates of the points
in question:

$$U(x_1, 1), \quad V(x_2, 1), \quad M(x_3, 1), \quad N(x_4, 1)$$

Then (12) and the corresponding expansion for $N$ become

$$\left.\begin{aligned} x_3 &= \mu x_1 + \nu x_2 \\ 1 &= \mu + \nu \end{aligned}\right\}
\qquad
\left.\begin{aligned} x_4 &= \bar\mu x_1 + \bar\nu x_2 \\ 1 &= \bar\mu + \bar\nu \end{aligned}\right\} \tag{15}$$

From the systems (15) we find $\mu$, $\nu$, $\bar\mu$, $\bar\nu$ and,
substituting them into (14), we get

$$(UVMN) = \frac{x_3 - x_1}{x_2 - x_3} : \frac{x_4 - x_1}{x_2 - x_4} \tag{16}$$

From elementary analytic geometry we know that the fractions in
the right member of (16) are the ratios $\lambda$ and $\bar\lambda$ in
which the points $M$ and $N$ respectively divide the segment $UV$:

$$\lambda = \frac{UM}{MV} = \frac{x_3 - x_1}{x_2 - x_3}, \qquad
\bar\lambda = \frac{UN}{NV} = \frac{x_4 - x_1}{x_2 - x_4}$$

Thus

$$(UVMN) = \lambda : \bar\lambda = \frac{UM}{MV} : \frac{UN}{NV} \tag{17}$$

for any four distinct points on the affine line.
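As a worked instance of the affine expression for the cross ratio, the
quotient of the two division ratios $UM/MV$ and $UN/NV$ can be computed
with exact rational arithmetic. The Python sketch below is illustrative;
the function names and the sample coordinates are assumptions.

```python
from fractions import Fraction

def division_ratio(a, b, p):
    """Ratio (p - a) / (b - p) in which the point p divides the segment ab."""
    return Fraction(p - a) / Fraction(b - p)

def cross_ratio(x1, x2, x3, x4):
    """Cross ratio (UVMN) of points with affine coordinates x1, x2, x3, x4."""
    return division_ratio(x1, x2, x3) / division_ratio(x1, x2, x4)

# U = 0, V = 3, M = 1, N = 2: the division ratios are 1/2 and 2.
quarter = cross_ratio(0, 3, 1, 2)

# U = 0, V = 3, M = 2, N = 6: M divides UV internally and N externally
# in ratios of equal absolute value.
harmonic = cross_ratio(0, 3, 2, 6)
```

In the second example the division ratios are $2$ and $-2$, so the cross
ratio is $-1$.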

16. The cross ratio of two pairs of points is an invariant quantity
with respect to the projective group.

To prove this, consider an arbitrary projective transformation.
By formulas (3) we have

$$\xi_i = \lambda \sum_k P_{ik}\,\xi'_k, \quad
u_i = \lambda \sum_k P_{ik}\,u'_k, \quad
v_i = \lambda \sum_k P_{ik}\,v'_k \tag{18}$$

Since the matrix $\|P_{ik}\|$ is nonsingular, from (12) and (18) we
get

$$\xi'_i = \mu u'_i + \nu v'_i \tag{19}$$

where $u'_i$, $v'_i$, $\xi'_i$ are the coordinates of the points $U'$,
$V'$, and $M'$, which are images of the points $U$, $V$ and $M$,
respectively. In similar fashion we obtain the coordinates of the
image $N'$ of $N$:

$$\eta'_i = \bar\mu u'_i + \bar\nu v'_i \tag{20}$$

From (19) and (20) we get

$$(U'V'M'N') = \frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu} = (UVMN)$$

which completes the proof.
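The invariance can also be observed directly in homogeneous coordinates
on a projective line. The Python sketch below finds the coefficients in
the expansion of a point in terms of $U$ and $V$ by Cramer's rule, forms
the cross ratio from them, and compares the values before and after
applying a nonsingular matrix and after rescaling the coordinates. The
names `coefficients`, `cross_ratio`, `apply` and the sample data are
illustrative assumptions.

```python
from fractions import Fraction

def coefficients(u, v, m):
    """Solve m = mu*u + nu*v for (mu, nu) in a 2-dimensional coordinate space."""
    det = Fraction(u[0] * v[1] - u[1] * v[0])
    mu = Fraction(m[0] * v[1] - m[1] * v[0]) / det
    nu = Fraction(u[0] * m[1] - u[1] * m[0]) / det
    return mu, nu

def cross_ratio(u, v, m, n):
    """(UVMN) = (nu/mu) : (nu_bar/mu_bar) from homogeneous coordinates."""
    mu, nu = coefficients(u, v, m)
    mu_bar, nu_bar = coefficients(u, v, n)
    return (nu / mu) / (nu_bar / mu_bar)

def apply(Q, xi):
    """Image of the coordinate column xi under the matrix Q (list of rows)."""
    return [sum(q * x for q, x in zip(row, xi)) for row in Q]

U, V, M, N = [0, 1], [3, 1], [1, 1], [2, 1]   # affine coordinates 0, 3, 1, 2
Q = [[2, 1], [1, 3]]                           # nonsingular: det Q = 5

before = cross_ratio(U, V, M, N)
after = cross_ratio(*(apply(Q, p) for p in (U, V, M, N)))

# Multiplying each point by its own nonzero factor changes nothing either.
rescaled = cross_ratio([0, 5], [-6, -2], [3, 3], [2, 1])
```

All three values coincide; with the sample data they equal $1/4$.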

17. At the conclusion of the preceding section we remarked that
an affine space $\mathfrak{A}_n$ supplemented with ideal elements is a
model of the general concept of an $n$-dimensional projective space. It
must be stressed that if we consider a supplemented affine space
as a projective space and admit any transformations of type (1),
Section 2, then we must not treat ideal elements preferentially and
must regard them as being on a par with ordinary elements, since
we can carry any such ideal point into an ordinary point via a
projective transformation. Also note that the definition of a
projective space does not isolate any elements as being at infinity.




Therefore  the  set  of  ideal  points  in  the  supplemented  affine  space 
is  not  an  invariant  entity  with  respect  to  the  projective  group.  For 
this  reason,  the  concept  of  points  at  infinity  (or  ideal  points)  does 
not  come  within  the  sphere  of  projective  geometry. 

18. By means of a transformation of coordinates in $P_n$ we can
make any preassigned hyperplane have the equation $\xi_{n+1} = 0$ and
we can agree to consider it to be infinitely distant. The indication
of precisely which hyperplane is taken for the ideal hyperplane
can be regarded as a return from projective space to affine space.

19. If homogeneous coordinates are introduced in affine space,
then, as can readily be verified, any affine transformation is given
by formulas of type (4), Section 1, with a nonsingular matrix of
coefficients. A comparison of (4), Section 1, with (1) of
this section will make it clear that the affine transformations of
the space $\mathfrak{A}_n$ may be taken as a special case of projective
transformations in a supplemented affine space, that is, in $P_n$.
Namely, we can regard as affine all transformations of type (1) that
preserve ideal points as ideal points.

Indeed, if from the condition $\xi_{n+1} = 0$ we necessarily obtain
$\xi'_{n+1} = 0$, then the projective transformation must have the
form (3), Section 1. If we are only interested in the points of the
space $\mathfrak{A}_n$ itself, then $\xi_{n+1} \neq 0$ and from (3),
Section 1, we get formulas (2), Section 1, that express an affine
transformation.
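A small numerical illustration may help here. The formulas of Section 1
are not reproduced in this section, so the Python sketch below assumes
the usual block form: an affine map $x' = Mx + b$ corresponds to the
$(n+1)\times(n+1)$ matrix with $M$ and $b$ in the first $n$ rows and
$(0, \dots, 0, 1)$ in the last row, acting on homogeneous coordinates
with $x_i = \xi_i/\xi_{n+1}$. All names and numbers are illustrative.

```python
from fractions import Fraction

def affine_as_projective(M, b):
    """Matrix [[M, b], [0 ... 0, 1]] acting on homogeneous coordinates."""
    n = len(M)
    return [list(M[i]) + [b[i]] for i in range(n)] + [[0] * n + [1]]

def apply(Q, xi):
    """Image of the coordinate column xi under the matrix Q (list of rows)."""
    return [sum(q * x for q, x in zip(row, xi)) for row in Q]

def to_affine(xi):
    """Affine coordinates xi_i / xi_{n+1} of an ordinary point (xi_{n+1} != 0)."""
    return [Fraction(x, xi[-1]) for x in xi[:-1]]

M = [[2, 1], [0, 3]]                  # nonsingular linear part
b = [5, -1]                           # translation part
Q = affine_as_projective(M, b)

ordinary = [1, 2, 1]                  # the affine point (1, 2)
ideal = [1, 2, 0]                     # an ideal point: last coordinate is 0

image_ordinary = apply(Q, ordinary)
image_ideal = apply(Q, ideal)
```

The ideal point stays ideal (its last coordinate remains zero), and the
ordinary point goes to $M(1, 2)^T + b = (9, 5)$.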

Important corollary. The group of all affine transformations in
$\mathfrak{A}_n$ is a subgroup of the projective group of the space
$P_n$.

Remark. For this formulation it is very important that we agree
to regard as affine certain special projective transformations. We
have thus included affine transformations in the category of
projective transformations.

Due to the fact that the range of projective transformations is
richer than that of affine transformations, the projective group has
fewer invariants than the affine group: every invariant of the
projective group is an invariant of the affine group, but the converse
is not true. For example, each of the ratios (17) is an affine
invariant but is not a projective invariant (the latter follows without
computations from Subsection 3, Section 5).

On the other hand, in contrast to affine geometry (and all the
more so to metric geometry) projective geometry treats of the
properties of geometric figures that are more stable in the sense that
they are preserved under all transformations of a more extensive
group.

20.  The  material  of  Subsections  8  to  19  serves  to  illustrate  the 
brief  statement  contained  in  Subsection  7. 




§  3.  A  bundle  of  planes  in  affine  space 


1. Consider in $(n+1)$-dimensional affine space $\mathfrak{A}_{n+1}$
the set of all planes of all dimensions (this includes straight lines)
passing through a fixed point $O$. This set is called a bundle (sheaf)
of planes with centre $O$. From now on we will denote both the bundle
and the centre by $O$.

Take $O$ for the origin of an affine system of coordinates with an
arbitrary basis $e_1, \dots, e_{n+1}$. We leave the origin unchanged,
and so we will identify $\mathfrak{A}_{n+1}$ and the corresponding
linear space $L_{n+1}$.

Every straight line of the bundle is uniquely determined by
specification of a point $M$ other than $O$. Let
$\xi_1, \dots, \xi_{n+1}$ be the coordinates of $M$ in the basis
$e_1, \dots, e_{n+1}$. Then for an arbitrary $\lambda \neq 0$ the point
$(\lambda\xi_1, \dots, \lambda\xi_{n+1})$ determines the same straight
line $OM$.

A straight line regarded as an element of the bundle will be
called a point; the numbers $\xi_1, \dots, \xi_{n+1}$ are its
homogeneous coordinates. Then it is clear that the set of such points
(that is, of straight lines of the bundle $O$) constitutes an
$n$-dimensional projective space $P_n$. It is also clear that every
$(k+1)$-dimensional plane of the bundle $O$ is a $k$-dimensional
projective plane in $P_n$, since it passes through the origin and,
consequently, is determined by a system of homogeneous linear equations
of rank $r = (n+1) - (k+1) = n - k$.

We have thus obtained another geometric model of a projective
space $P_n$, this time in the form of a bundle in an affine space of
dimension $n + 1$. Unlike Section 1, here we do not require any
adjoining of new points. All elements of the set under
consideration (straight lines of a bundle) are geometrically equivalent.
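In computations it is often convenient to pick one representative from
each line of the bundle. The Python sketch below uses one possible
convention, scaling the coordinates so that the last nonzero coordinate
equals 1; this convention and the function name `canonical` are
assumptions made for illustration, not something prescribed by the text.

```python
from fractions import Fraction

def canonical(xi):
    """Representative of the line through O and the point xi, scaled so that
    the last nonzero coordinate equals 1."""
    pivot = next((x for x in reversed(xi) if x != 0), None)
    if pivot is None:
        raise ValueError("(0, ..., 0) does not determine a line of the bundle")
    return tuple(Fraction(x) / pivot for x in xi)

M = [2, -4, 6]
scaled_M = [-1, 2, -3]                # the same line: coordinates times -1/2
other = [1, 2, 0]                     # a different line of the bundle
```

Two nonzero coordinate sets determine the same line of the bundle exactly
when their canonical representatives coincide.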

2. We now consider jointly both of our models of an
$n$-dimensional projective space. This will help to illuminate the
geometric meaning of transformations of coordinates and projective
transformations in $P_n$, which were defined algebraically in Section 2.

For an $n$-dimensional affine space $\mathfrak{A}_n$ let us take in the
space $\mathfrak{A}_{n+1}$ a hyperplane that does not pass through the
centre of the bundle $O$. To $\mathfrak{A}_{n+1}$ we adjoin ideal
elements in accord with Subsections 5 to 8, Section 1. Then
$\mathfrak{A}_{n+1}$ will turn into the projective space $P_{n+1}$ and
the hyperplane $\mathfrak{A}_n$ will become a projective space of
dimension $n$, the supplemented hyperplane.

Every straight line $a$ of bundle $O$ (the line is supplemented with
an ideal point) intersects the supplemented hyperplane
$\mathfrak{A}_n$ in some point $A$ (Fig. 94). We will say that the
point $A$ corresponds to the line $a$. This correspondence is
one-to-one because of the adjunction of ideal elements to
$\mathfrak{A}_{n+1}$; namely, if a straight line in affine space is
parallel to the hyperplane $\mathfrak{A}_n$, then $A$ is an ideal point
(Fig. 95).

Let $OM$ be a direction vector of line $a$,
$OM = \xi_1 e_1 + \dots + \xi_{n+1} e_{n+1}$. The numbers
$\lambda\xi_1, \dots, \lambda\xi_{n+1}$, which are proportional to the
coordinates of the vector $OM$, will be taken for the homogeneous
coordinates of the point $A$ of the supplemented hyperplane which
corresponds to line $a$ (Fig. 96). They coincide with the homogeneous
coordinates of this line that are defined in accordance with
Subsection 1.

It is clear that the choice of basis $e_1, \dots, e_{n+1}$ in
$\mathfrak{A}_{n+1}$ fully determines the projective coordinates in the
bundle $O$ and in the supplemented hyperplane. This system of
projective coordinates is preserved under a similarity transformation
of the basis $e_i$ (i.e. when changing to a basis of the form
$\alpha e_1, \dots, \alpha e_{n+1}$, $\alpha \neq 0$).

Note the particular case where the vectors $e_1, \dots, e_n$ are
parallel to $\mathfrak{A}_n$. In this case, we denote by $O'$ the point
of intersection of the hyperplane with the line of bundle $O$ whose
direction vector is $e_{n+1}$. In $\mathfrak{A}_n$ we introduce affine
coordinates with origin $O'$ and basis $e_1, \dots, e_n$ (Fig. 97).
Then the affine coordinates $x_1, \dots, x_n$ of an arbitrary point
$A \in \mathfrak{A}_n$ and its homogeneous coordinates
$\xi_1, \dots, \xi_{n+1}$ will be related by the formulas (1) of
Section 1, that is, $x_i = \xi_i/\xi_{n+1}$, $i = 1, \dots, n$,
$\xi_{n+1} \neq 0$.




3. Now let us return to the general case. Let
$e'_1, \dots, e'_{n+1}$ be a new basis in $\mathfrak{A}_{n+1}$. Then

$$e_j = \sum_{i=1}^{n+1} Q_{ij}\, e'_i, \quad j = 1, \dots, n+1 \tag{1}$$

If $OM$ is an arbitrary nonzero vector on the straight line $a$ having
in the old basis the coordinates $\xi_1, \dots, \xi_{n+1}$ and in the
new basis the coordinates $\xi'_1, \dots, \xi'_{n+1}$, then

$$\xi'_i = \sum_{j=1}^{n+1} Q_{ij}\,\xi_j, \quad i = 1, \dots, n+1 \tag{2}$$

But the numbers $\xi_1, \dots, \xi_{n+1}$ are the old homogeneous
coordinates of $A$ and the numbers $\xi'_1, \dots, \xi'_{n+1}$ are its
new homogeneous coordinates in the supplemented hyperplane
$\mathfrak{A}_n$. For this reason the formulas (2) may be viewed as
a transformation of the homogeneous coordinates of points in a
supplemented hyperplane (and also, at the same time, in the
bundle $O$) under the just described change in the coordinate
system of the space $\mathfrak{A}_{n+1}$. In place of (2) we can write
(1), Section 2, since homogeneous coordinates are defined up to a
proportionality factor.
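The relation between the change of basis and the change of coordinates
can be verified on a small example. The Python sketch below assumes the
index convention in which the old basis is expressed through the new one
as $e_j = \sum_i Q_{ij} e'_i$, so that the new coordinates are
$\xi'_i = \sum_j Q_{ij}\xi_j$; the basis vectors, the matrix and the
helper `combine` are illustrative assumptions.

```python
def combine(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k] of coordinate tuples."""
    size = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(size)]

new_basis = [(1, 0), (1, 1)]          # e'_1, e'_2 written in some fixed frame
Q = [[2, 1],
     [0, 3]]                          # nonsingular: det Q = 6

# Old basis through the new one: e_j = sum_i Q[i][j] * e'_i (column j of Q).
old_basis = [combine([Q[i][j] for i in range(2)], new_basis) for j in range(2)]

xi = [1, 2]                           # old coordinates of the vector OM
xi_new = [sum(Q[i][j] * xi[j] for j in range(2)) for i in range(2)]

vector_from_old = combine(xi, old_basis)      # OM computed in the old basis
vector_from_new = combine(xi_new, new_basis)  # OM computed in the new basis
```

Both computations produce one and the same vector, which confirms that
`xi` and `xi_new` are coordinates of the same vector $OM$ in the two
bases.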

4. Now suppose a transformation of projective coordinates is
specified via (1) of Section 2 in an $n$-dimensional projective
space $P_n$. Identifying $P_n$ with the bundle $O$ in
$\mathfrak{A}_{n+1}$ and setting $\lambda = 1$ in (1), Section 2, we
obtain formulas (2). They may be viewed as formulas of a transformation
of coordinates in $\mathfrak{A}_{n+1}$ under which the origin remains
fixed and the basis is transformed by formulas (1).

5. To summarize, then, we can say that the projective
coordinates in $P_n$ are determined by specification of the basis
$e_1, \dots, e_{n+1}$ in $\mathfrak{A}_{n+1}$, and formulas (1) of
Section 2 express the change to a new projective system of coordinates
in $P_n$, which system is given by the new basis
$e'_1, \dots, e'_{n+1}$ in $\mathfrak{A}_{n+1}$. It is immaterial here
which of the two models of $P_n$ is considered: the bundle $O$ or the
supplemented hyperplane.

6. Formulas (1), Section 2, can be similarly interpreted if we
assume that they specify a projective transformation in $P_n$. To do
this we have to consider in $\mathfrak{A}_{n+1}$ an affine
transformation that leaves the point $O$ fixed and is given by (2).
Then an arbitrary point $M \in \mathfrak{A}_{n+1}$ with coordinates
$\xi_1, \dots, \xi_{n+1}$ goes into the point
$M'(\xi'_1, \dots, \xi'_{n+1})$. If we are not interested in the point
$M'$ itself, but only in the line $OM'$ into which $OM$ goes, then in
(2) we can multiply the numbers $\xi_1, \dots, \xi_{n+1}$ or the
numbers $\xi'_1, \dots, \xi'_{n+1}$ by any number $\lambda \neq 0$. If
we put the factor $\lambda$ in the left members of formulas (2), then
we get formulas (1) of Section 2.

Thus we can say that any projective transformation in the
augmented hyperplane $\mathfrak{A}_n$ is induced by an affine
transformation in $\mathfrak{A}_{n+1}$ which leaves fixed the point $O$
(or, what is the same thing, by a nonsingular linear transformation in
$L_{n+1}$).

If we take the bundle $O$ for a model of the projective space $P_n$,
then we can say that the projective transformations in $P_n$ are
merely affine transformations in $\mathfrak{A}_{n+1}$ that preserve the
point $O$, bearing in mind that points in $P_n$ are the straight lines
of the bundle $O$.

7. To get a complete picture of what has been said about the
model of a projective space in the form of a bundle $O$ in
$\mathfrak{A}_{n+1}$ and about projective transformations on this model
we advise the reader to picture himself observing the space
$\mathfrak{A}_{n+1}$ from the centre of the bundle $O$. Then all points
of the space lying on a single ray of vision will appear to be a single
point. Then if the reader, located at $O$, observes the movement of
points under a nonsingular linear transformation in
$L_{n+1} = \mathfrak{A}_{n+1}$, he will actually see a projective
transformation in the bundle or in any hyperplane not passing through
point $O$.

Note that distinct linear transformations $Ax$ and $\mu Ax$ ($\mu$ any
nonzero scalar) in $\mathfrak{A}_{n+1}$ are one and the same projective
transformation in the bundle $O$.

8. As an application of the foregoing constructions, let us look
into the conditions that uniquely determine a projective
transformation in $P_n$. We give the following definition which will be
useful in the sequel.

Definition. A set of $r + 1$ points in $P_n$ is in the general position
if the points do not belong to a single $(r - 1)$-dimensional
(projective) plane.

It is obvious that the total number of points in such a system
cannot exceed $n + 1$.

Remark. If for $P_n$ we consider a bundle $O$ in $\mathfrak{A}_{n+1}$,
then the points $a_1, \dots, a_m \in P_n$ are in the general position
in the space $P_n$ if and only if the direction vectors of the straight
lines $a_1, \dots, a_m$ in $\mathfrak{A}_{n+1}$ are linearly independent
in $\mathfrak{A}_{n+1}$. This is evident if we recall that, in such a
model, $k$-dimensional (projective) planes of $P_n$ are
$(k + 1)$-dimensional planes of the bundle $O$.

In $P_n$ let there be arbitrarily given a set of $n + 2$ points
$a_1, \dots, a_{n+1}, a_{n+2}$ such that any $n + 1$ of the points are
in the general position. Again in $P_n$ let there be arbitrarily given
a similar set of points $a'_1, \dots, a'_{n+1}, a'_{n+2}$. We will
prove that the following theorem holds true.

Theorem 1. There exists a unique projective transformation of
the space $P_n$ that carries $a_i$ into $a'_i$ for all
$i = 1, \dots, n + 2$.

Proof. For a model of the projective space $P_n$ we again take a
bundle $O$ in $\mathfrak{A}_{n+1}$. In $O$ let the straight lines
$a_1, \dots, a_{n+1}, a_{n+2}$ be given as inverse-image points, and
the straight lines $a'_1, \dots, a'_{n+1}, a'_{n+2}$ as image points.
Take a (nonzero) direction vector $e_{n+2}$ of the straight line
$a_{n+2}$ and any (nonzero) direction vectors $\tilde e_i$ of lines
$a_i$ ($i = 1, \dots, n + 1$). By hypothesis, the lines
$a_1, \dots, a_{n+1}$, being points of $P_n$, are in the general
position. Hence, the vectors $\tilde e_i$ ($i = 1, \dots, n + 1$)
constitute a basis in $\mathfrak{A}_{n+1}$. Therefore we have the
expansion

$$e_{n+2} = \lambda_1 \tilde e_1 + \dots + \lambda_{n+1} \tilde e_{n+1}$$

We now prove that not a single one of the numbers $\lambda_i$ is zero:

$$\lambda_i \neq 0, \quad i = 1, \dots, n + 1 \tag{3}$$

Let us assume the contrary. For example, let $\lambda_1 = 0$. Then
$e_{n+2}$ is linearly expressible in terms of the vectors
$\tilde e_2, \dots, \tilde e_{n+1}$ and so the system
$\tilde e_2, \dots, \tilde e_{n+1}, e_{n+2}$ is linearly dependent,
which contradicts the hypothesis, since the points
$a_2, \dots, a_{n+1}, a_{n+2} \in P_n$ are in the general position.
Thus $\lambda_1 \neq 0$. The remaining inequalities of (3) are
established in a similar manner.

Set $e_i = \lambda_i \tilde e_i$, $i = 1, \dots, n + 1$. Then

$$e_{n+2} = e_1 + \dots + e_{n+1} \tag{4}$$

From the independence of the vectors
$\tilde e_1, \dots, \tilde e_{n+1}$ and the inequalities (3) it follows
that the vectors $e_1, \dots, e_{n+1}$ are also independent. In similar
fashion it can be proved that there are vectors $e'_i$ lying
respectively on the straight lines $a'_i$ such that

$$e'_{n+2} = e'_1 + \dots + e'_{n+1} \tag{5}$$

and $e'_1, \dots, e'_{n+1}$ are linearly independent in
$\mathfrak{A}_{n+1}$, whence and also from Subsection 6, Section 3,
Chapter VII, it follows that in $\mathfrak{A}_{n+1} = L_{n+1}$ there is
a nonsingular linear transformation $x' = Ax$ for which
$e'_i = Ae_i$, $i = 1, \dots, n + 1$. Using (5) and (4), we find that

$$e'_{n+2} = e'_1 + \dots + e'_{n+1} = Ae_1 + \dots + Ae_{n+1}
= A(e_1 + \dots + e_{n+1}) = Ae_{n+2}$$

Thus $e'_i = Ae_i$ for all $i = 1, \dots, n + 1, n + 2$; that is, the
linear transformation $x' = Ax$ in $\mathfrak{A}_{n+1}$ carries the
given straight lines $a_1, \dots, a_{n+2}$ into the given straight
lines $a'_1, \dots, a'_{n+2}$ and, hence, induces the desired
projective transformation, which we denote by $f$, in the bundle $O$.




We prove uniqueness. Let there be another projective
transformation $\varphi$ in the bundle $O$ for which
$\varphi(a_i) = a'_i$, $i = 1, \dots, n + 2$. By Subsection 6 it is
induced by a nonsingular linear transformation $B$ of space
$\mathfrak{A}_{n+1}$, which transformation also carries lines $a_i$
into $a'_i$ ($i = 1, \dots, n + 2$). Then

$$e''_i = Be_i = \mu_i e'_i \tag{6}$$

where $\mu_i$ are certain numbers; $i = 1, \dots, n + 2$. Here, due to
(4) and (6), we have

$$e''_{n+2} = Be_{n+2} = Be_1 + \dots + Be_{n+1}
= \mu_1 e'_1 + \dots + \mu_{n+1} e'_{n+1} \tag{7}$$

On the other hand, using (5) and (6), we find that

$$e''_{n+2} = \mu_{n+2} e'_{n+2}
= \mu_{n+2} e'_1 + \dots + \mu_{n+2} e'_{n+1} \tag{8}$$

From (7) and (8) follows

$$\mu_{n+2} = \mu_1 = \mu_2 = \dots = \mu_{n+1} \tag{9}$$

since the vectors $e'_1, \dots, e'_{n+1}$ are independent. Putting
$\mu_{n+2} = \mu$, we get

$$Be_i = \mu e'_i = \mu Ae_i, \quad i = 1, \dots, n + 1$$

But then $Bx = \mu Ax$ for any vector $x$ in $\mathfrak{A}_{n+1}$, and
this means that $\varphi = f$, which completes the proof of Theorem 1.
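The existence part of the proof is constructive, and it can be turned
into a short computation. The Python sketch below follows the same
steps: it rescales direction vectors of the first $n + 1$ lines so that
they sum to a direction vector of the last line, does the same for the
image lines, and then builds the matrix $A$ that carries one rescaled
basis into the other. The function names and the example in $P_1$ are
illustrative assumptions, and exact rational arithmetic is used.

```python
from fractions import Fraction

def solve(columns, b):
    """Solve sum_i x_i * columns[i] = b by Gauss-Jordan elimination.
    Assumes the columns form a basis (otherwise no pivot is found)."""
    n = len(b)
    m = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [a - factor * c for a, c in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]

def normalized_frame(points):
    """Vectors e_1..e_{n+1} on the first n+1 lines whose sum lies on the last line."""
    *base, last = points
    lam = solve(base, last)                    # last = sum_i lam_i * base_i
    assert all(l != 0 for l in lam), "points are not in general position"
    return [[l * x for x in p] for l, p in zip(lam, base)]

def projective_map(source, target):
    """Matrix A (list of rows) with A e_i = e'_i for the two normalized frames."""
    e, e_prime = normalized_frame(source), normalized_frame(target)
    size = len(e)
    columns = []
    for j in range(size):
        unit = [1 if i == j else 0 for i in range(size)]
        c = solve(e, unit)                     # unit vector expanded in the basis e
        columns.append([sum(ck * ep[i] for ck, ep in zip(c, e_prime))
                        for i in range(size)])
    return [[columns[j][i] for j in range(size)] for i in range(size)]

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def same_point(p, q):
    n = len(p)
    return all(p[i] * q[j] == p[j] * q[i] for i in range(n) for j in range(i + 1, n))

source = [(0, 1), (1, 0), (1, 1)]              # three points of P_1
target = [(1, 1), (1, -1), (3, 1)]
A = projective_map(source, target)
```

For this example the computed matrix is $A = \begin{pmatrix} 1 & 2 \\ -1 & 2 \end{pmatrix}$;
any nonzero multiple of it induces the same projective transformation, in
agreement with the uniqueness part of the theorem.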

9. We have already pointed out that the coordinates in $P_n$ are
determined by a specification of the basis in $\mathfrak{A}_{n+1}$. But
in $\mathfrak{A}_{n+1}$ the basis is an entity that is exterior to the
space $P_n$. It is desirable to have a different way of specifying
projective coordinates that is based solely on a consideration of
entities of the projective space itself.

Suppose in $P_n$ we have an arbitrarily chosen and fixed set of
$n + 2$ points

$$A_1, A_2, \dots, A_{n+1}, B \tag{10}$$

such that any $n + 1$ points are in the general position.

Theorem 2. In an $n$-dimensional projective space $P_n$ there is a
unique set of projective coordinates in which the points (10) have
the following homogeneous coordinates:

$$\begin{aligned}
&A_1\,(1, 0, \dots, 0), \\
&A_2\,(0, 1, \dots, 0), \\
&\dots\dots\dots\dots \\
&A_{n+1}\,(0, 0, \dots, 1), \\
&B\,(1, 1, \dots, 1)
\end{aligned} \tag{11}$$



Proof. For a model of $P_n$ let us consider the bundle $O$ in
$\mathfrak{A}_{n+1}$. As in the proof of Theorem 1, we find the
direction vectors $e_1, \dots, e_{n+1}, e_{n+2}$ of the straight lines
$A_1, \dots, A_{n+1}, B$ of bundle $O$ such that

(a) $e_{n+2} = e_1 + \dots + e_{n+1}$,

(b) $e_1, \dots, e_{n+1}$ are linearly independent in
$\mathfrak{A}_{n+1}$.

We take the vectors $e_1, \dots, e_{n+1}$ for a basis in
$\mathfrak{A}_{n+1}$. Then, by Subsection 1, the homogeneous
coordinates of the points will be determined in $P_n$ and conditions
(11) will be fulfilled (the last one due to (a)).

We now prove uniqueness. Let a projective coordinate system in
$P_n$ be specified by a different basis
$\bar e_1, \dots, \bar e_{n+1}$ of the space $\mathfrak{A}_{n+1}$
and let conditions (11) hold again. Then, by virtue of (11), the
vectors $\bar e_1, \dots, \bar e_{n+1}$ are necessarily direction
vectors for the straight lines $A_1, \dots, A_{n+1}$ of bundle $O$,
that is, $\bar e_i = \alpha_i e_i$ ($\alpha_i \neq 0$,
$i = 1, \dots, n + 1$). Besides,
$\sum_{i=1}^{n+1} \bar e_i = \alpha_{n+2} e_{n+2}$,
$\alpha_{n+2} \neq 0$, since the vector
$\bar e_1 + \dots + \bar e_{n+1}$ must be a direction vector for the
straight line $B$ of the bundle $O$. Then, as in (7) to (9), it is
established that $\alpha_1 = \dots = \alpha_{n+1}$, whence it follows
that the homogeneous coordinates
$\bar\xi_1, \dots, \bar\xi_{n+1}$ of a point $X$ that are specified
via the basis $\bar e_1, \dots, \bar e_{n+1}$ are proportional to its
coordinates $\xi_1, \dots, \xi_{n+1}$ derived from the basis
$e_1, \dots, e_{n+1}$. The proof of Theorem 2 is complete.
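Theorem 2 also yields a practical recipe for computing the coordinates
of a point relative to a given frame of $n + 2$ points. The Python
sketch below rescales direction vectors of the first $n + 1$ lines so
that their sum is a direction vector of the line $B$, and then expands
an arbitrary point in the resulting basis. The function names and the
frame chosen in $P_1$ are illustrative assumptions.

```python
from fractions import Fraction

def solve(columns, b):
    """Solve sum_i x_i * columns[i] = b by Gauss-Jordan elimination.
    Assumes the columns form a basis."""
    n = len(b)
    m = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [a - factor * c for a, c in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]

def frame_basis(vertices, unit_point):
    """Basis e_1..e_{n+1} on the lines A_1..A_{n+1} with e_1 + ... + e_{n+1} on B."""
    lam = solve(vertices, unit_point)
    assert all(l != 0 for l in lam), "frame is not in general position"
    return [[l * x for x in a] for l, a in zip(lam, vertices)]

def coordinates(vertices, unit_point, x):
    """Homogeneous coordinates of x in the frame (defined up to a common factor)."""
    return solve(frame_basis(vertices, unit_point), x)

def same_point(p, q):
    n = len(p)
    return all(p[i] * q[j] == p[j] * q[i] for i in range(n) for j in range(i + 1, n))

A1, A2, B = (1, 1), (1, -1), (3, 1)            # a frame of n + 2 = 3 points in P_1
X = (5, 1)

coords_A1 = coordinates([A1, A2], B, A1)
coords_A2 = coordinates([A1, A2], B, A2)
coords_B = coordinates([A1, A2], B, B)
coords_X = coordinates([A1, A2], B, X)
```

In this frame $A_1$, $A_2$ and $B$ receive coordinates proportional to
$(1, 0)$, $(0, 1)$ and $(1, 1)$, and the point `X` receives coordinates
proportional to $(3, 4)$.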

10. Remark. To illustrate the geometric meaning of Theorem 2 let us consider, as a model of $P_n$, the augmented affine space, assuming that the homogeneous coordinates $\xi_1, \ldots, \xi_{n+1}$ are introduced, in accordance with Subsections 1 to 5 of Section 1, on the basis of a certain affine system of coordinates $x_1, \ldots, x_n$ in $\mathfrak{A}_n$. Then the points having coordinates (11) play the following roles:

$A_{n+1}(0, \ldots, 0, 1)$ serves as the origin of the affine coordinate system in $\mathfrak{A}_n$;




$A_1, A_2, \ldots, A_n$ are ideal points of the coordinate axes $x_1, x_2, \ldots, x_n$ respectively; their choice determines the direction of the coordinate axes;

the point $B(1, \ldots, 1, 1)$, called the unit point, defines a choice of basis vectors on each of the axes $x_1, \ldots, x_n$ (Fig. 98).

11. We conclude this section with a very popular geometric model of a real projective space $P_n$. We assume that a Euclidean metric has been introduced into $\mathfrak{A}_{n+1}$, and together with the bundle $O$ we consider an n-dimensional sphere $S$ with centre $O$ (for a definition of an n-dimensional sphere see Subsection 5, Section 6, Chapter XI). Each straight line of the bundle intersects the sphere in two diametrically opposite points (for $n = 1$, see Fig. 99). We identify such points, that is, each pair of diametrically opposite points of the sphere $S$ will be regarded as one point of a new set. By construction, the set thus obtained is in one-to-one correspondence with the bundle $O$ and, hence, may be taken as a space $P_n$. In this model, every k-dimensional projective plane is depicted in the form of a k-dimensional sphere with identified diametrically opposite points, since each (k + 1)-dimensional plane of bundle $O$ intersects the sphere $S$ along some sphere of dimension k (see Fig. 100 where $n = 2$, $k = 1$).

Fig. 99    Fig. 100    Fig. 101



Note that in the particular case $n = 1$ (and in this case alone) it is possible to depict $P_n$ as a one-dimensional sphere, that is, in the form of a circle without identifying diametrically opposite points. Namely, on a two-dimensional plane consider a bundle $O$ (in other words, a pencil of straight lines passing through point $O$) and a circle $S$ passing through the same point (Fig. 101). With $O$ on the circle $S$ we associate a straight line $o$ tangent to the circle at this point. With any other point $A \in S$ we associate the straight line $OA$. We obtain a one-to-one correspondence between the lines of bundle $O$ and the points of circle $S$ that allows us to regard the circle as a model of a projective line (in this connection see Subsection 13, Section 2).


§  4.  Central  projection 

1. Above we considered projective transformations in a given space $P_n$. This concept can be generalized to the consideration of a projective mapping of one n-dimensional projective space $P_n$ onto another n-dimensional projective space $P'_n$, on the assumption that this mapping is given by formulas (1), Section 2, with the proviso that $(\xi_1, \ldots, \xi_{n+1})$ are the coordinates of the inverse image in the space $P_n$, and $(\xi'_1, \ldots, \xi'_{n+1})$ are the coordinates of the image in the space $P'_n$. In this section we consider an important special case called central projection.

2.  From  Subsection  10,  Section  2,  it  follows  that  in  Pn  any 
straight  line  and  any  hyperplane  intersect. 

It is easy to verify that if the straight line $a$ does not lie entirely in the hyperplane $P$, then they have a unique point in common. To prove this, adjoin the equation of the hyperplane $P$ to the system of equations of rank $n - 1$ that define $a$. If $a$ and $P$ have two distinct common points, then the combined system of equations has two independent nontrivial solutions so that its rank is $r \leq n - 1$. But this is only possible if the equation of the hyperplane $P$ is a consequence of the equations of the straight line $a$, that is, when $a \subset P$.
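The common point of a line and a hyperplane can also be written down directly. The sketch below is an illustration under stated assumptions: the line is given by two distinct points `u`, `v` in homogeneous coordinates, the hyperplane by its coefficient vector `p`, and the function name `meet_line_hyperplane` is hypothetical. The combination $(p \cdot v)\,u - (p \cdot u)\,v$ lies on the line and satisfies the equation of the hyperplane.

```python
def dot(p, x):
    """Left side of the hyperplane equation p_1*x_1 + ... + p_{n+1}*x_{n+1}."""
    return sum(a * b for a, b in zip(p, x))

def meet_line_hyperplane(u, v, p):
    """Common point of the line through u, v and the hyperplane p . x = 0."""
    pu, pv = dot(p, u), dot(p, v)
    if pu == 0 and pv == 0:
        raise ValueError("the line lies entirely in the hyperplane")
    # pv*u - pu*v is a point of the line on which p vanishes.
    return tuple(pv * a - pu * b for a, b in zip(u, v))

# Hypothetical data in P_3.
u = (1, 0, 2, 1)
v = (0, 1, 1, 3)
p = (1, 1, -1, 2)
x = meet_line_hyperplane(u, v, p)
```

When both products vanish, the two points lie in the hyperplane and so does the whole line, which is the excluded case $a \subset P$.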

3.  Now  suppose  in  Pn  we  have  chosen  two  different  hyperplanes 
P  and  P'  with  an  arbitrary  fixed  point  0  that  does  not  belong  to 
either  of  these  hyperplanes. 

Pass a straight line $OM$ through an arbitrary point $M$ in hyperplane $P$ and through point $O$. By Subsection 2, the line $OM$ will intersect $P'$ in a unique point $M'$, and for every point $M' \in P'$ there will be a unique point $M \in P$ such that $M$, $O$ and $M'$ are collinear (Fig. 102).




We call $M'$ the projection of $M$ from the centre $O$ on the hyperplane $P'$. We will also write $M' = f(M)$. The one-to-one mapping $M' = f(M)$ of hyperplane $P$ onto hyperplane $P'$ is called the central projection of $P$ on $P'$ from the centre $O$.
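The definition above can be sketched in coordinates. The code is a minimal illustration, assuming the hyperplanes are given by coefficient vectors and points by homogeneous coordinates; the function name `central_projection` and the sample data are assumptions made for this example.

```python
def dot(p, x):
    return sum(a * b for a, b in zip(p, x))

def central_projection(center, target, m):
    """Project the point m from `center` onto the hyperplane target . x = 0."""
    pc, pm = dot(target, center), dot(target, m)
    if pc == 0:
        raise ValueError("the centre lies in the target hyperplane")
    # A point of the line through center and m on which `target` vanishes.
    return tuple(pm * c - pc * x for c, x in zip(center, m))

# Hypothetical data in P_3: centre O and hyperplanes P, P' not containing O.
O = (0, 0, 0, 1)
P = (0, 0, 1, -1)        # xi_3 - xi_4 = 0
P_prime = (1, 0, 0, 1)   # xi_1 + xi_4 = 0
M = (2, 1, 1, 1)         # a point of P
M_prime = central_projection(O, P_prime, M)
back = central_projection(O, P, M_prime)
```

Projecting `M_prime` back onto `P` from the same centre returns a vector proportional to `M`, in line with the one-to-one character of the mapping.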

4. In place of the hyperplanes $P$, $P'$ we can consider two planes $P_k$, $P'_k$ of the same arbitrary dimension k and determine the central projection of one of them on the other; but a special mutual arrangement of the planes $P_k$, $P'_k$ and the point $O$ is required.


Fig.  103 


For instance, suppose that $P_k$, $P'_k$ and $O$ belong to a (k + 1)-dimensional projective plane $P_{k+1}$ in the space $P_n$. We will also assume that the planes $P_k$ and $P'_k$ do not coincide and that the point $O$ does not belong to them. Then the arguments and constructions of Subsections 2 and 3 are fully applicable, since $P_k$ and $P'_k$ may be regarded as hyperplanes in a (k + 1)-dimensional projective space, which is what the plane $P_{k+1}$ is (see Fig. 103, in which $k = 1$).

5. Theorem. The central projection of plane $P_k$ onto plane $P'_k$ ($1 \leq k \leq n - 1$) is a projective mapping.

Proof. We know that $P_k$ and $P'_k$ can be regarded as projective spaces (Section 2, Subsection 14) and that the central projection $M' = f(M)$ is a one-to-one mapping of $P_k$ onto $P'_k$ (for $k \leq n - 2$ due to the special mutual arrangement of the planes). It is therefore sufficient for us to determine the type of formulas that specify this mapping.

In $P_n$ we choose a system of coordinates so that point $O$ has coordinates $\xi^0_1 = \ldots = \xi^0_n = 0$, $\xi^0_{n+1} \neq 0$. By formula (11) of Section 2, an arbitrary point $M \in P_k$ has coordinates

$$\xi_i = \mu_1 u_i^{(1)} + \ldots + \mu_{k+1} u_i^{(k+1)}$$

where $u_i^{(j)}$ are certain numbers. Similarly, the arbitrary point $M' \in P'_k$ has the coordinates

$$\xi'_i = \mu'_1 v_i^{(1)} + \ldots + \mu'_{k+1} v_i^{(k+1)}$$

Here, $\mu_1, \ldots, \mu_{k+1}$ may be considered the homogeneous coordinates of $M$ inside $P_k$, and $\mu'_1, \ldots, \mu'_{k+1}$ the homogeneous coordinates of $M'$ inside $P'_k$. If $O$, $M$, $M'$ are collinear, then

$$\xi^0_i = \alpha\xi_i + \beta\xi'_i, \quad i = 1, \ldots, n+1 \tag{1}$$

Here $\alpha \neq 0$ and $\beta \neq 0$ since none of the points $M(\xi_i)$ and $M'(\xi'_i)$ coincides with the point $O(\xi^0_i)$. For $i = 1, \ldots, n$ we get $\alpha\xi_i + \beta\xi'_i = 0$ but $\alpha\xi_{n+1} + \beta\xi'_{n+1} \neq 0$. Conversely, if the equations (1) are ensured for certain numbers $\alpha \neq 0$, $\beta \neq 0$ when $i = 1, \ldots, n$, and if $\alpha\xi_{n+1} + \beta\xi'_{n+1} \neq 0$, then the points $M(\xi_i)$ and $M'(\xi'_i)$ are collinear with point $O$. Thus, if we want to find the projection $M'$ of the given point $M$, we have to solve the system of equations

$$\mu'_1 v_i^{(1)} + \ldots + \mu'_{k+1} v_i^{(k+1)} = \lambda\left(\mu_1 u_i^{(1)} + \ldots + \mu_{k+1} u_i^{(k+1)}\right), \quad i = 1, 2, \ldots, n \tag{2}$$

assuming $\mu_1, \ldots, \mu_{k+1}$, $u_i^{(j)}$, $v_i^{(j)}$ and $\lambda \neq 0$ to be given and $\mu'_1, \ldots, \mu'_{k+1}$ to be sought. Then for $\alpha \neq 0$ and $\beta \neq 0$ we can take any numbers, provided that $\lambda = -\alpha/\beta$. Besides we must have

$$\mu'_1 v_{n+1}^{(1)} + \ldots + \mu'_{k+1} v_{n+1}^{(k+1)} \neq \lambda\left(\mu_1 u_{n+1}^{(1)} + \ldots + \mu_{k+1} u_{n+1}^{(k+1)}\right) \tag{3}$$

We know definitely that the point $M'$ exists and is unique. Therefore system (2) is uniquely solvable and inequality (3) holds. Besides that, the matrix made up of the columns $v_i^{(1)}, \ldots, v_i^{(k+1)}$, where $i = 1, 2, \ldots, n$, has rank $k + 1$ (see Subsection 14, Section 2). And so it has a basis minor $D$ of order $k + 1$. For the sake of simplicity, let us suppose the minor $D$ is composed of the first $k + 1$ rows. Then in (2) we take the first $k + 1$ equations and discard the rest. Solving this system by Cramer's rule, we get

$$
\begin{aligned}
\mu'_1 &= \lambda\left(\frac{D_{11}}{D}\mu_1 + \ldots + \frac{D_{1\,k+1}}{D}\mu_{k+1}\right),\\
&\ldots\ldots\ldots\ldots\ldots\ldots\ldots\\
\mu'_{k+1} &= \lambda\left(\frac{D_{k+1\,1}}{D}\mu_1 + \ldots + \frac{D_{k+1\,k+1}}{D}\mu_{k+1}\right)
\end{aligned}
\tag{4}
$$

where $D_{jl}$ is a determinant obtained from $D$ by replacing the jth column with the column $u_i^{(l)}$ ($i = 1, \ldots, k+1$). We see that, to within the factor $\lambda$, the $\mu'_1, \ldots, \mu'_{k+1}$ are expressed by linear and homogeneous formulas in terms of $\mu_1, \ldots, \mu_{k+1}$. The coefficients of



these formulas, that is, $D_{jl}/D$, constitute a nonsingular matrix, for otherwise it would be possible to find numbers $\mu_1, \ldots, \mu_{k+1}$, not all zero, for which $\mu'_1 = \ldots = \mu'_{k+1} = 0$. But this means that a certain point $M \in P_k$ does not have a projection on $P'_k$, contrary to the hypothesis. We have thus established that the central projection of $P_k$ onto $P'_k$ from centre $O$ is given by the homogeneous linear formulas (4) with a nonsingular matrix of coefficients. The proof of the theorem is complete.

Thus,  central  projection  is  a  special  case  of  a  projective 
mapping.  Whence,  apparently,  stems  the  term  projective  mapping. 

6. The reader is advised, by way of an exercise, to prove that if a (one-to-one) central projection of the plane $P_k$ onto the plane $P'_k$ from centre $O$ is possible, then $P_k$, $P'_k$ and $O$ lie in a (k + 1)-dimensional plane of the space $P_n$.

7.  The  results  obtained  in  Section  2  on  projective  transforma¬ 
tions  automatically  carry  over  to  the  case  of  projective  mappings, 
and  so  from  the  theorem  proved  in  Subsection  5  we  have  the  fol¬ 
lowing  proposition. 


Suppose we are given a straight line $a$ in the plane $P_k$ and a straight line $a' \subset P'_k$, which is the image of $a$ under a projection of $P_k$ onto $P'_k$ from a centre $O$. Also suppose that $U$, $V$, $M$, $N$ are four points on $a$ and that $U'$, $V'$, $M'$, $N'$ are their projections on $a'$ (Fig. 104). Then

(U'V'M'N') = (UVMN)

which is to say that the cross ratio is an invariant under central projections.
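The invariance can be checked on a small example in the projective plane. This is a sketch under assumptions: the formulas of Section 2 defining the cross ratio are not reproduced in this passage, so the code uses the common determinant expression $(UVMN) = \frac{[UM][VN]}{[VM][UN]}$, where $[XY]$ is the determinant of $X$, $Y$ and an auxiliary point `z` off the line; the function names and the sample points are illustrative.

```python
from fractions import Fraction

def dot(p, x):
    return sum(a * b for a, b in zip(p, x))

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def project(center, target_line, m):
    """Central projection of m from `center` onto the line target_line . x = 0."""
    pc, pm = dot(target_line, center), dot(target_line, m)
    return tuple(pm * c - pc * x for c, x in zip(center, m))

def cross_ratio(z, u, v, m, n):
    """Cross ratio of four collinear points; z is any point off their line."""
    return Fraction(det3(z, u, m) * det3(z, v, n),
                    det3(z, v, m) * det3(z, u, n))

O = (0, 0, 1)            # centre of projection
a_prime = (1, 1, 1)      # target line xi_1 + xi_2 + xi_3 = 0
U, V, M, N = (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 2, 0)  # points of xi_3 = 0
images = [project(O, a_prime, X) for X in (U, V, M, N)]
before = cross_ratio(O, U, V, M, N)
after = cross_ratio((1, 0, 0), *images)
```

The auxiliary point only contributes a common factor that cancels, so two different auxiliary points are used on purpose for the two lines.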

§  5.  Projective  equivalence  of  figures 

1. Two figures in a projective space $P_n$ (that is, two sets consisting of points, straight lines and k-dimensional planes) are said to be projectively equivalent if one of them is carried into the other by means of a projective transformation.




Since all projective transformations form a group, we have the following propositions.

(1) If a figure $\mathcal{A}$ is projectively equivalent to a figure $\mathcal{A}'$, then $\mathcal{A}'$ is equivalent to $\mathcal{A}$.

(2) If figure $\mathcal{A}$ is equivalent to $\mathcal{A}'$, and $\mathcal{A}'$ is equivalent to $\mathcal{A}''$, then $\mathcal{A}$ is equivalent to $\mathcal{A}''$.

(3)  Every  figure  is  equivalent  to  itself. 

In  projective  geometry  projectively  equivalent  figures  are  not 
distinguished,  just  as,  in  metric  geometry,  congruent  figures  are 
indistinguishable. 

2.  Theorem  1.  In  n-dimensional  projective  space,  any  two  planes 
of  the  same  dimension  are  projectively  equivalent. 

Proof. Given an arbitrary k-dimensional plane $P_k$ specified by the system of equations

$$
\begin{aligned}
a_{11}\xi_1 + \ldots + a_{1\,n+1}\xi_{n+1} &= 0,\\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots&\\
a_{r1}\xi_1 + \ldots + a_{r\,n+1}\xi_{n+1} &= 0
\end{aligned}
\tag{1}
$$

the rank of which is equal to the number of equations ($r = n - k$). Consider the projective transformation

$$
\begin{aligned}
\xi'_1 &= a_{11}\xi_1 + \ldots + a_{1\,n+1}\xi_{n+1},\\
&\ldots\ldots\ldots\ldots\ldots\ldots\ldots\\
\xi'_r &= a_{r1}\xi_1 + \ldots + a_{r\,n+1}\xi_{n+1},\\
\xi'_{r+1} &= a_{r+1\,1}\xi_1 + \ldots + a_{r+1\,n+1}\xi_{n+1},\\
&\ldots\ldots\ldots\ldots\ldots\ldots\ldots\\
\xi'_{n+1} &= a_{n+1\,1}\xi_1 + \ldots + a_{n+1\,n+1}\xi_{n+1}
\end{aligned}
\tag{2}
$$

where the coefficients of the expressions $\xi'_{r+1}, \ldots, \xi'_{n+1}$ are taken at pleasure so long as the matrix of the transformation (2) is nonsingular (such a choice is possible since the system (1) has rank $r = n - k$). Under transformation (2) the plane $P_k$ is carried into a completely definite plane $\xi'_1 = 0, \ldots, \xi'_r = 0$, which we denote by $P^0_k$. In the same way it is established that any other k-dimensional projective plane $P^*_k$ is projectively equivalent to the plane $P^0_k$, whence it follows that $P_k$ and $P^*_k$ are equivalent.

3.  Let  us  consider  the  projective  line.  On  this  line  all  ordered 
triads  of  distinct  points  are  projectively  equivalent  by  Theorem  1, 
Section  3.  Let  us  now  see  under  what  conditions  quadruples  of 
points  are  projectively  equivalent.  We  will  prove  the  following 
theorem. 




Theorem 2. Ordered quadruples of points $U$, $V$, $M$, $N$ and $U'$, $V'$, $M'$, $N'$ on a single line are projectively equivalent if and only if their cross ratios are equal:

(UVMN) = (U'V'M'N')  (3)

Proof. The necessity of (3) follows from the projective invariance of the cross ratio. We prove that it is sufficient. Let (3) hold. By Theorem 1, Section 3, there exists a projective transformation $f$ such that

U' = f(U), V' = f(V), M' = f(M)

Put $f(N) = N''$. Then (U'V'M'N'') = (UVMN) = (U'V'M'N') due to the projective invariance of the cross ratio and condition (3). But if distinct points $U'$, $V'$, $M'$ and the cross ratio (U'V'M'N') = g are given, then $N'$ is defined uniquely since its homogeneous coordinates are uniquely (up to a factor) expressed by the formulas of Subsection 15, Section 2, in terms of g and the coordinates of $U'$, $V'$ and $M'$. Therefore $N'' = N'$, and the proof is complete.

4. We say that an ordered pair of points $MN$ divides an ordered pair of points $UV$ (located on the straight line $MN$) harmonically if

(UVMN) = -1  (4)

In this case we also say that the quadruple of points $U$, $V$, $M$, $N$ is a harmonic set and that the point $N$ is the fourth harmonic point for the (ordered) triad $U$, $V$, $M$.

The harmonicity of a set of four points is preserved if we:

(1) interchange the pairs of points $M$, $N$ and $U$, $V$;

(2) interchange the points within any one of these pairs.

These properties follow immediately from formula (4) and the formulas of Subsection 15, Section 2.

5. Let the affine line be supplemented with an ideal point $N$. Consider the segment $AB$ on this line; denote the midpoint by $M$.

Theorem 3. The two points $MN$ harmonically divide the two points $AB$.

In other words, the midpoint of the segment $AB$ is the fourth harmonic point relative to $A$, $B$, $N$, where $N$ is an ideal point.

Proof. Introduce the affine coordinate $x$ on the straight line so that $x = 0$ at point $A$ and $x = 1$ at point $B$. Then $x = 1/2$ at point $M$ (Fig. 105). Also introduce the homogeneous coordinates $(\xi, \eta)$, putting $x = \xi/\eta$. We then obtain the following homogeneous coordinates of the points under consideration: $A(0, 1)$, $M(1, 2)$, $B(1, 1)$, $N(1, 0)$, whence, via formulas (12)-(14), Section 2, we get

(ABMN) = -1
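The value of the cross ratio in Theorem 3 is easy to recompute. The sketch below is an illustration only: formulas (12)-(14) of Section 2 are not quoted in this passage, so the determinant expression used for the cross ratio is an assumed, commonly used convention, and the function names are hypothetical.

```python
from fractions import Fraction

def det2(p, q):
    """Determinant of two points of the projective line in homogeneous coordinates."""
    return p[0] * q[1] - p[1] * q[0]

def cross_ratio(u, v, m, n):
    """(UVMN) as [UM][VN] / ([VM][UN]) with [XY] = det2(X, Y)."""
    return Fraction(det2(u, m) * det2(v, n), det2(v, m) * det2(u, n))

# Homogeneous coordinates (xi, eta) with x = xi / eta, as in the proof.
A, B, M, N = (0, 1), (1, 1), (1, 2), (1, 0)
value = cross_ratio(A, B, M, N)
swapped_pairs = cross_ratio(M, N, A, B)   # interchange the two pairs
swapped_inside = cross_ratio(B, A, M, N)  # interchange points within a pair
```

All three values come out equal to -1 for this data, which matches both Theorem 3 and the two symmetry properties listed in Subsection 4.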

6.  Now  let  us  consider  some  figures  on  a  two-dimensional  pro¬ 
jective  plane. 

A three-point is a set of three non-collinear points together with the three straight lines joining the points in pairs (Fig. 106).


Fig. 105    Fig. 106

Since under a projective transformation a straight line goes into a straight line, it follows from Theorem 1 of Section 3 that any two three-points on a projective plane are projectively equivalent.

7.  The  notion  of  a  three-point  affords  a  good  pictorial  illustra¬ 
tion  of  certain  geometric  properties  of  a  real  projective  plane.  We 
give  these  properties  without  taking  up  the  proofs. 


Fig.  107 


Fig.  108 


One three-point divides the entire real projective plane into four triangles labelled 1, 2, 3, 4 in Figs. 106 and 107 (in Fig. 107 the projective plane is depicted as a sphere with identified diametrically opposite points).

If  a  polygon  is  decomposed  into  triangles  on  an  affine  plane,  it 
is  possible  to  choose  the  sense  of  traversal  of  each  of  the  triangles 
so  that  the  traversals  of  any  two  adjacent  triangles  are  associated 




with  opposite-sense  motions  along  their  common  side  (Fig.  108). 
This  choice  of  traversal  of  the  triangles  may  be  made  in  two  differ¬ 
ent  ways,  which  is  in  keeping  with  the  two  different  orientations 
of  the  affine  plane. 

A  projective  plane  broken  up  into  triangles  does  not  admit  such 
a  choice  of  matched  traversal  of  all  triangles.  For  example,  if  we 
match  the  traversals  of  the  first  and  second  triangles  (Fig.  107), 


and  then  the  second  and  third  triangles,  the  traversals  of  the  first 
and  third  triangles  will  not  be  matched. 

For  this  reason,  we  say  that  the  real  projective  plane  is  non- 
orientable. 

Remark.  On  an  affine  plane  it  is  impossible  to  arrange  not  only 
four  but  even  three  triangles  with  sides  adjoined  in  the  fashion 
observed  in  the  three-point.  However,  it  is  possible  in  three-di¬ 
mensional  affine  space  to  construct  a  model  of  the  mutual  arran¬ 
gement  of  any  three  triangles  of  a  three-point  with  the  aid  of  the 
so-called  Mobius  strip  (a  surface  pasted  together  from  a  twisted 
rectangle,  as  shown  in  Fig.  109).  A  fourth  triangle  of  the  three- 
point  can  be  pasted  to  the  edge  of  the  Mobius  strip  to  obtain  a 
model  of  the  entire  projective  plane  in  the  form  of  a  surface  if  we 




allow  for  deformation  of  the  pasted  triangles  and  self-intersection 
of  the  surface.  This  self-intersection  can  be  eliminated  by  moving 
into  four-dimensional  space. 


8. A figure in the projective plane composed of four points, of which no three are collinear, and six straight lines joining the points in pairs is termed a complete quadrangle.

The  indicated  points  are  called  vertices  and  the  straight  lines 
joining  them  in  pairs  are  said  to  be  the  sides  of  the  quadrangle. 
Fig.  110  shows  a  quadrangle  ABCD.  Sides  without  a  common 
vertex  are  called  opposite  sides.  The  quadrangle  ABCD  has  three 
pairs  of  opposite  sides:  AB  and  CD,  AC  and  BD,  BC  and  AD.  The 


points of intersection of opposite sides are called the diagonal points of the quadrangle. In Fig. 110 the diagonal points are $P$, $Q$, $R$.

From  Theorem  1,  Section  3,  it  follows  that  any  two  quadrangles 
are  projectively  equivalent  (whereas  sets  of  five  points  on  a  pro¬ 
jective  plane  are,  generally,  not  equivalent). 

Observe  that  all  three  diagonal  points  of  the  quadrangle  have 
equal  status  in  the  sense  that  any  one  of  them  can  be  carried  into 
another  by  a  projective  transformation  that  carries  the  quadrangle 
into  itself.  For  instance,  if  we  want  to  carry  point  P  into  Q  (see 
Fig.  110),  it  suffices  to  take  a  projective  transformation  f  under 
which 

f(A) = A, f(B) = C,
f(C) = B, f(D) = D


Then  the  line  AB  goes  into  AC  and  CD  goes  into  BD  so  that  the 
points  P  and  Q  are  interchanged,  and  the  quadrangle  ABCD  is 
transformed  to  this  same  quadrangle. 
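The interchange of the diagonal points can be checked on a concrete quadrangle. This sketch assumes, as the text suggests, that $P$ is the intersection of $AB$ with $CD$ and $Q$ the intersection of $AC$ with $BD$; the choice of coordinates and the helper names are illustrative. Lines through two points and intersections of two lines are both computed with the vector product of homogeneous triples.

```python
def cross(p, q):
    """Vector product: the line through two points, or the point on two lines."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def same_point(p, q):
    """Homogeneous triples name the same point when they are proportional."""
    return cross(p, q) == (0, 0, 0)

A, B, C, D = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)
P = cross(cross(A, B), cross(C, D))
Q = cross(cross(A, C), cross(B, D))

def f(x):
    """Projective transformation that swaps the second and third coordinates."""
    return (x[0], x[2], x[1])
```

For this choice of vertices the transformation `f` fixes `A` and `D`, swaps `B` and `C`, and sends `P` to `Q` and `Q` to `P`.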


9.  Given  a  quadrangle  ABCD.  Draw  through  two  of  its  diagonal 
points,  P  and  Q,  a  straight  line  and  denote  by  E  and  F  the  points 




of  its  intersection  with  the  two  sides  of  the  quadrangle  that  pass 
through  the  third  diagonal  point  R  (Fig.  111). 

Theorem  4.  If  the  foregoing  construction  has  been  carried  out, 
the  following  quadruples  of  points  are  harmonic  sets: 

P, Q, E, F;  A, D, E, R;  B, C, F, R

Proof.  We  assume  the  projective  plane  is  obtained  by  adjoining 
ideal  points  to  the  affine  plane.  Perform  a  projective  transformation 


that  carries  A,  B,  C  and  D  into  the  vertices  A',  B',  C'  and  D',  res¬ 
pectively,  of  a  parallelogram.  Then  P  will  go  to  the  centre  P'  of 
the  parallelogram  A'B'C'D'  (Fig.  112),  Q  and  R  will  go  to  the 
ideal  points  Q'  and  R'  of  lines  A'B'  and  A'D',  respectively.  Line 


Fig.  113 


PQ  will  go  to  the  straight  line  parallel  to  A'B',  the  points  E  and 
F  to  the  midpoints  E'  and  F'  of  the  opposite  sides  A'D'  and  B'C', 
and  P'  will  be  the  midpoint  of  the  line  segment  E'F'.  Therefore, 
taking  into  account  Theorem  3  of  Subsection  5  and  the  projective 
invariance  of  a  cross  ratio,  we  have 


(PQEF) = (P'Q'E'F') = -1,
(ADER) = (A'D'E'R') = -1,
(BCFR) = (B'C'F'R') = -1


which  is  what  we  set  out  to  prove. 
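Theorem 4 can also be verified by direct computation on one quadrangle. The sketch makes several assumptions that are not in the text: the vertices are placed at convenient coordinates, $R$ is taken as the intersection of $AD$ with $BC$ (so that $E$ lies on $AD$ and $F$ on $BC$), $P$ and $Q$ are the other two diagonal points, and the cross ratio is computed with an assumed determinant convention using an auxiliary point off the line.

```python
from fractions import Fraction

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def det3(a, b, c):
    return dot(a, cross(b, c))

def cross_ratio(u, v, m, n):
    """(UVMN) for collinear points of the plane, via an auxiliary point z."""
    line = cross(u, v)
    z = next(e for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1)) if dot(line, e) != 0)
    return Fraction(det3(z, u, m) * det3(z, v, n),
                    det3(z, v, m) * det3(z, u, n))

A, B, C, D = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)
P = cross(cross(A, B), cross(C, D))
Q = cross(cross(A, C), cross(B, D))
R = cross(cross(A, D), cross(B, C))
line_PQ = cross(P, Q)
E = cross(line_PQ, cross(A, D))   # PQ meets side AD
F = cross(line_PQ, cross(B, C))   # PQ meets side BC
ratios = [cross_ratio(P, Q, E, F),
          cross_ratio(A, D, E, R),
          cross_ratio(B, C, F, R)]
```

Since interchanging the points within a pair preserves harmonicity, the result does not depend on which of the two remaining diagonal points is called $P$.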


10.  We  now  show  how  it  is  possible,  geometrically,  to  construct 
the  fourth  harmonic  point  N  for  the  three  given  points  U,  V,  M  on 
a  Euclidean  plane  ( U ,  V,  M  are  three  distinct  collinear  points). 



Through $U$ draw a line perpendicular to $UV$ and lay off on it line segments $AU = UD$ (Fig. 113). Denote by $B$ and $C$ the points of intersection of $DM$ and $AM$ with the perpendicular to $UV$ that passes through $V$. Then lines $DC$ and $AB$ will clearly intersect $UV$ in one point, which is the desired point (by Theorem 4).
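The construction can be followed in Cartesian coordinates. This is a sketch under assumptions: $UV$ is placed on the x-axis with $U$ at the origin, the sample positions of $V$, $M$ and the length $AU = UD$ are arbitrary, the helper names are hypothetical, and the final check uses an assumed determinant form of the cross ratio in the coordinates $(x, 1)$.

```python
from fractions import Fraction

def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p and q."""
    return (p[1] - q[1], q[0] - p[0], p[0] * q[1] - p[1] * q[0])

def intersect(l1, l2):
    """Intersection point of two non-parallel lines given by coefficients."""
    d = Fraction(l1[0] * l2[1] - l1[1] * l2[0])
    return ((l1[1] * l2[2] - l1[2] * l2[1]) / d,
            (l1[2] * l2[0] - l1[0] * l2[2]) / d)

U, V, M = (0, 0), (4, 0), (1, 0)
A, D = (0, 1), (0, -1)                      # AU = UD on the perpendicular at U
perp_V = line_through(V, (V[0], 1))          # perpendicular to UV through V
B = intersect(line_through(D, M), perp_V)
C = intersect(line_through(A, M), perp_V)
UV = line_through(U, V)
N1 = intersect(line_through(D, C), UV)
N2 = intersect(line_through(A, B), UV)

def det2(p, q):
    return p[0] * q[1] - p[1] * q[0]

def cross_ratio(u, v, m, n):
    return Fraction(det2(u, m) * det2(v, n), det2(v, m) * det2(u, n))

h = lambda point: (point[0], 1)              # homogeneous coordinates on UV
ratio = cross_ratio(h(U), h(V), h(M), h(N1))
```

Both lines $DC$ and $AB$ meet $UV$ at the same point, and that point forms a harmonic set with $U$, $V$, $M$. If $M$ is the midpoint of $UV$, the two lines are parallel to $UV$ and `intersect` fails with a division by zero, which corresponds to the ideal point of Theorem 3.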

§  6.  Projective  classification  of  quadric  hypersurfaces 

1. Theorem. For two quadric hypersurfaces in a real space $P_n$ to be projectively equivalent, it is necessary and sufficient that the left members of their equations have the same ranks and equal (in absolute value) signatures.

Proof. Given the hypersurfaces

$$a(\xi, \xi) = 0, \quad b(\xi, \xi) = 0 \tag{1}$$

where $a(\xi, \xi)$ and $b(\xi, \xi)$ are quadratic forms in the homogeneous coordinates $(\xi_1, \ldots, \xi_{n+1}) = \xi$.

The geometric meaning of each of the equations (1) remains unaltered if both sides of the equation are multiplied by -1. We can therefore assume that the canonical form of each of the quadratic forms does not contain more negative terms than positive terms. Then the signature is nonnegative and the equality of ranks and signatures of the two quadratic forms is equivalent to the equality of their ranks and the positive indices. Taking this argument into account, let us prove first the sufficiency and then the necessity.

(1) Let the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ have one and the same rank $r$ and the same positive index $k$.

We consider the quadratic form

$$c(\xi, \xi) = \xi_1^2 + \ldots + \xi_k^2 - \xi_{k+1}^2 - \ldots - \xi_r^2 \tag{2}$$

and, together with it, the third hypersurface $c(\xi, \xi) = 0$.

We know that there exists a nonsingular linear transformation of the variables which carries the form $a(\xi, \xi)$ into a form of type (2). This means that there is a projective transformation which carries the hypersurface $a(\xi, \xi) = 0$ into the hypersurface $c(\xi, \xi) = 0$, that is to say, the indicated hypersurfaces are projectively equivalent. In the same way, the hypersurface $b(\xi, \xi) = 0$ is projectively equivalent to the hypersurface $c(\xi, \xi) = 0$. Hence, the hypersurfaces (1) are projectively equivalent.

(2) Let $a(\xi, \xi) = 0$ and $b(\xi, \xi) = 0$ be projectively equivalent. This means that there is a linear transformation of the variables $\xi$ into the variables $\eta$ which carries the form $a(\xi, \xi)$ into the quadratic form $b(\eta, \eta)$. But then the ranks and the positive indices of the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ are the same. The proof is complete.
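The criterion of the theorem lends itself to a direct test. The following is a minimal sketch, not the book's procedure: it computes the positive and negative indices of a symmetric matrix by exact symmetric elimination (a congruence reduction in the spirit of Lagrange's method) and then compares ranks and absolute values of signatures. The function names and the sample forms are assumptions.

```python
from fractions import Fraction

def inertia(matrix):
    """Return (positive index, negative index) of a symmetric matrix."""
    a = [[Fraction(x) for x in row] for row in matrix]
    active = list(range(len(a)))
    pos = neg = 0
    while active:
        pivot = next((i for i in active if a[i][i] != 0), None)
        if pivot is None:
            pair = next(((i, j) for i in active for j in active
                         if i < j and a[i][j] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, j = pair
            # Congruence step: add row j to row i and column j to column i,
            # which turns the diagonal entry a[i][i] into 2*a[i][j].
            for k in active:
                a[i][k] += a[j][k]
            for k in active:
                a[k][i] += a[k][j]
            continue
        d = a[pivot][pivot]
        if d > 0:
            pos += 1
        else:
            neg += 1
        active.remove(pivot)
        # Pass to the Schur complement of the pivot entry.
        for j in active:
            for k in active:
                a[j][k] -= a[j][pivot] * a[pivot][k] / d
    return pos, neg

def projectively_equivalent(a, b):
    """Same rank and same absolute value of the signature."""
    (pa, na), (pb, nb) = inertia(a), inertia(b)
    return pa + na == pb + nb and abs(pa - na) == abs(pb - nb)

half = Fraction(1, 2)
oval = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]            # x1^2 + x2^2 - x3^2
mixed = [[1, 0, 0], [0, 0, -half], [0, -half, 0]]    # x1^2 - x2*x3
zero = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]             # x1^2 + x2^2 + x3^2
```

The first two forms have indices (2, 1) and are reported as equivalent, while the third has signature 3 and is not equivalent to either.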




Remark. In an n-dimensional complex projective space, the hypersurfaces $a(\xi, \xi) = 0$ and $b(\xi, \xi) = 0$ are projectively equivalent if and only if the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ have the same rank $r$. The proof is analogous to the preceding.

2. Definition. A quadric hypersurface $a(\xi, \xi) = 0$ in n-dimensional projective space is said to be nondegenerate if the quadratic form $a(\xi, \xi)$ is nonsingular, that is, if its rank $r = n + 1$.

Remark.  This  definition  is  in  accord  with  the  terminology  of  the 
preceding  chapter. 

3. In a real space $P_n$, every nondegenerate hypersurface is projectively equivalent to one of the hypersurfaces of the type

$$\xi_1^2 + \ldots + \xi_k^2 - \xi_{k+1}^2 - \ldots - \xi_{n+1}^2 = 0$$

where $\frac{n+1}{2} \leq k \leq n + 1$. Therefore, when $n$ is even, there are $\frac{n}{2} + 1$ projectively distinct nondegenerate quadric hypersurfaces in $P_n$; when $n$ is odd, there are $\frac{1}{2}(n + 3)$ such hypersurfaces.
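The count follows from listing the admissible sign patterns. A short sketch, with a hypothetical function name, enumerates the canonical forms having $k$ positive and $n + 1 - k$ negative squares with at least as many positive as negative terms, and compares the count with the two formulas.

```python
def nondegenerate_classes(n):
    """Pairs (positive squares, negative squares) with k >= n + 1 - k."""
    return [(k, n + 1 - k) for k in range(n + 1, 0, -1) if 2 * k >= n + 1]

def expected_count(n):
    """n/2 + 1 for even n and (n + 3)/2 for odd n."""
    return n // 2 + 1 if n % 2 == 0 else (n + 3) // 2

plane_classes = nondegenerate_classes(2)   # zero curve and oval curve
space_classes = nondegenerate_classes(3)   # zero, oval and toroidal surfaces
```

For $n = 2$ and $n = 3$ the lists contain two and three patterns respectively, in agreement with the curves and surfaces discussed in this section.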

4.  In  two-dimensional  projective  space  (the  projective  plane), 
there  are  (in  the  real  case)  two  projectively  distinct  nondegenerate 
quadric  hypersurfaces,  which,  incidentally,  it  would  be  more  natu¬ 
ral  in  this  case  to  call  curves  (which  is  precisely  what  is  done), 
namely: 

(1) the curve

$$\xi_1^2 + \xi_2^2 + \xi_3^2 = 0$$

which has no points at all in the real plane and is therefore called the zero curve;

(2) the curve

$$\xi_1^2 + \xi_2^2 - \xi_3^2 = 0 \tag{3}$$

which has real points and is called an oval curve.

5. We assume that the projective plane is obtained from the ordinary plane by adjoining the ideal line $\xi_3 = 0$. The line $\xi_3 = 0$ does not intersect the curve (3) and in equation (3) we can pass to nonhomogeneous coordinates $x = \frac{\xi_1}{\xi_3}$, $y = \frac{\xi_2}{\xi_3}$ to get the ellipse

$$x^2 + y^2 = 1$$

6. Now suppose that the straight line $\xi_2 = 0$ is an ideal line. It intersects the curve (3) in two distinct real points $(\pm\lambda, 0, \lambda)$. Discarding them, we now pass to the nonhomogeneous coordinates $x = \frac{\xi_3}{\xi_2}$, $y = \frac{\xi_1}{\xi_2}$ to obtain the hyperbola

$$x^2 - y^2 = 1$$

7. On the projective plane, make the following transformation of homogeneous coordinates:

$$
\begin{aligned}
\eta_1 &= \xi_1,\\
\eta_2 &= -\xi_2 + \xi_3,\\
\eta_3 &= \xi_2 + \xi_3
\end{aligned}
$$

Then (3) becomes

$$\eta_1^2 - \eta_2\eta_3 = 0 \tag{3a}$$

We assume the line $\eta_3 = 0$ to be an ideal line. It intersects the curve (3a) in the double point $(0, \lambda, 0)$. Discarding it, we put $x = \frac{\eta_1}{\eta_3}$, $y = \frac{\eta_2}{\eta_3}$ to get the parabola $y = x^2$.
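The passage from (3) to (3a) and then to the nonhomogeneous equation can be written out step by step. The derivation below is a worked check, assuming the substitution $\eta_1 = \xi_1$, $\eta_2 = -\xi_2 + \xi_3$, $\eta_3 = \xi_2 + \xi_3$ and the nonhomogeneous coordinates $x = \eta_1/\eta_3$, $y = \eta_2/\eta_3$.

```latex
\begin{aligned}
\eta_1^2 - \eta_2\eta_3
  &= \xi_1^2 - (-\xi_2 + \xi_3)(\xi_2 + \xi_3) \\
  &= \xi_1^2 - (\xi_3^2 - \xi_2^2) \\
  &= \xi_1^2 + \xi_2^2 - \xi_3^2, \\[4pt]
\frac{\eta_1^2 - \eta_2\eta_3}{\eta_3^2}
  &= x^2 - y
  \quad\text{with } x = \frac{\eta_1}{\eta_3},\ y = \frac{\eta_2}{\eta_3}.
\end{aligned}
```

Hence the curve (3) is the same as $\eta_1^2 - \eta_2\eta_3 = 0$, and away from the line $\eta_3 = 0$ its equation reads $y = x^2$. On the line $\eta_3 = 0$ the equation reduces to $\eta_1^2 = 0$, which gives the single (double) point $(0, \lambda, 0)$.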

8.  Thus,  the  affinely  distinct  ellipse,  hyperbola  and  parabola  are 
obtained  from  one  and  the  same  oval  curve,  depending  on  how  it 
is  located  relative  to  the  straight  line  which  is  (or  is  assumed  to 
be)  at  infinity. 

There is no such distinction in those models of the projective plane where the line at infinity is not indicated. For instance, if for $P_2$ we take the bundle $O$ in $\mathfrak{A}_3$, then the oval curve is an ordinary cone. When passing from the bundle to the plane $\mathfrak{A}_2$ in accordance with Subsection 2, Section 3, we get an ellipse, a hyperbola or a parabola, depending on the position of the plane $\mathfrak{A}_2$ relative to the cone at hand (see Figs. 114, 115, 116).

9. By Subsection 3, there are three distinct nondegenerate quadric surfaces in three-dimensional real projective space. They are:

(1) $\xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2 = 0$, the zero surface (imaginary ellipsoid) devoid of real points.

(2) $\xi_1^2 + \xi_2^2 + \xi_3^2 - \xi_4^2 = 0$, an oval surface. This type includes the ellipsoid, elliptic paraboloid and the hyperboloid of two sheets. The analogy is complete with the oval curve considered above.

(3) $\xi_1^2 + \xi_2^2 - \xi_3^2 - \xi_4^2 = 0$, a toroidal surface. It is easy to verify by an appropriate calculation that when passing to affine space a toroidal surface turns into a hyperboloid of one sheet or a hyperbolic paraboloid. The difference is that a hyperboloid of one sheet intersects the ideal plane along an oval curve, while the hyperbolic paraboloid intersects the ideal plane along two rectilinear generators. The torus is a good pictorial model of a toroidal surface.


Fig.  116 




Without dwelling on the proof, we confine ourselves to Fig. 117, in which to the parallels $\alpha$, $\beta$, $\gamma$, $\delta$ of the torus there correspond rectilinear generators of the toroidal surface (those are labelled with the same letters); to the meridians I, II, III, IV, V of the torus correspond oval curves on the toroidal surface that are labelled with the same numbers. The infinitely distant oval curve of the toroidal surface corresponds to a single meridian $ABCD$ of the torus. It is depicted as two copies, although one should imagine points having the same labels as being identical.


Fig.  117 


In Subsection 9, Section 6, Chapter XI, we noted that there are two distinct types of real quadric cones in four-dimensional affine space. One of them represents an oval surface, the other a toroidal surface if for the model of $P_3$ we consider a bundle in $\mathfrak{A}_4$.

10.  Also  note  that  when  considering  degenerate  surfaces  it  is 
necessary  to  bear  in  mind  that  in  projective  space  there  is  no 
longer  any  difference  between  cylinders  and  cones.  The  rectilinear 
generators  of  a  cylinder  that  are  parallel  from  the  affine  standpoint 
intersect  in  a  single  ideal  point.  For  example,  in  three-dimensional 
real  projective  space  the  equation  (3)  specifies  a  real  cone.  When 
passing  to  affine  space,  this  cone,  depending  on  the  position  of  the 
ideal  plane,  turns  into  one  of  the  following  four  surfaces  that  are 
distinct  in  the  affine  classification:  the  cone  x2  +  y2  —  z2  =  0,  an 
elliptic  cylinder,  a  parabolic  cylinder,  or  a  hyperbolic  cylinder. 




11. We now return to the two-dimensional case. It is readily
seen that when passing from the affine plane to the projective
plane, a quadric curve is supplemented with ideal points of those
straight lines (and only such lines) that have asymptotic directions
relative to the curve under consideration (Figs. 118, 119). In
Fig. 119 (as in Fig. 116) we have a parabola and a model of an
oval curve in the form of a cone α in the bundle O. To the parabola
corresponds the cone α with the exception of the single rectilinear
generator a. The axis of the parabola and all straight lines
parallel to it have a common ideal point A, to which corresponds,
in the bundle O, the straight line a. A is the ideal point of
the parabola under consideration.


Fig. 119

12. The assertion stated in Subsection 11 will be proved at once
for the n-dimensional case.

If we assume the points with $\xi_{n+1} = 0$ to be ideal points, then to find
all the ideal points lying on a quadric hypersurface it will suffice
to put $\xi_{n+1} = 0$ in the equation of the hypersurface. Returning to
equation (6) of Section 1, we obtain, for $\xi_{n+1} = 0$,

$$\sum_{i,j=1}^{n} a_{ij}\xi_i\xi_j = 0 \qquad (4)$$

Except for notation, equation (4) is the same as equation (8),
Section 9, Chapter XI, which defines the coordinates of vectors
having asymptotic directions relative to the hypersurface at hand.
From this it follows that the straight lines having asymptotic
directions are precisely those straight lines on which the ideal
points of the given hypersurface lie.
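
As a small worked illustration of this rule (the particular curve is our own choice and is not taken from the text), consider the hyperbola $x^2 - y^2 = 1$, written in homogeneous coordinates $x = \xi_1/\xi_3$, $y = \xi_2/\xi_3$. Putting $\xi_3 = 0$ gives its ideal points:

```latex
\xi_1^2 - \xi_2^2 - \xi_3^2 = 0
\;\xrightarrow{\;\xi_3 = 0\;}\;
\xi_1^2 - \xi_2^2 = 0
\;\Longrightarrow\;
(\xi_1 : \xi_2 : \xi_3) = (1 : 1 : 0)
\quad\text{or}\quad
(1 : -1 : 0).
```

These two ideal points lie on the straight lines with direction vectors $(1, 1)$ and $(1, -1)$, which are exactly the asymptotic directions of the hyperbola.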


§  7.  The  intersection  of  a  quadric  hypersurface 
and  a  straight  line.  Polars 


1. Given in an n-dimensional projective space $P_n$ a quadric
hypersurface

$$\sum_{i,j=1}^{n+1} a_{ij}\xi_i\xi_j = 0 \qquad (\alpha)$$

and a straight line

$$\xi_i = \mu u_i + \nu v_i \quad (i = 1, \dots, n+1) \qquad (1)$$

passing through any two points $U(u_1, \dots, u_{n+1})$, $V(v_1, \dots, v_{n+1})$
of $P_n$. To find the points of intersection of line (1) with
hypersurface (α), put expressions (1) into (α). We get an equation of the
form

$$A\mu^2 + 2B\mu\nu + C\nu^2 = 0 \qquad (2)$$

where

$$A = \sum_{i,j=1}^{n+1} a_{ij}u_iu_j, \quad B = \sum_{i,j=1}^{n+1} a_{ij}u_iv_j, \quad C = \sum_{i,j=1}^{n+1} a_{ij}v_iv_j \qquad (3)$$

Equation (2) determines the desired points of intersection.

Let us investigate (2). The solution $\mu = \nu = 0$ yields
$\xi_1 = \dots = \xi_{n+1} = 0$, so that no point in $P_n$ corresponds to it. Only
those solutions need be sought for which $\mu$ and $\nu$ do not vanish at
the same time. The following three cases are possible, depending
on the coefficients A, B, C.

(1) $AC - B^2 \neq 0$. We will show that in this case there are two
distinct points of intersection.

First suppose that $A \neq 0$. Then if $\nu = 0$, it follows that $\mu = 0$.
So we assume that $\nu \neq 0$. Dividing (2) by $\nu^2$, we get for the ratio
$\mu/\nu$ a quadratic equation, which has two distinct roots. Denote them
by $\lambda_1$ and $\lambda_2$. We then get two sets of solutions of (2):

$$\mu = \lambda_1\nu, \quad \mu = \lambda_2\nu \qquad (4)$$

where $\nu$ is a free unknown. For $\nu \neq 0$, we get from (1) and (4)
the homogeneous coordinates of two distinct points of intersection
of the hypersurface (α) and the line (1).

If under these circumstances $a_{ij}$, $u_i$, $v_i$ are real, but
$AC - B^2 > 0$, then the quadratic equation for $\mu/\nu$ has the negative
discriminant $4(B^2 - AC)$, and for $\{\xi_i\}$ we have complex conjugate
values. In that case, even if we are considering a real space $P_n$, we
say that the straight line (1) intersects the hypersurface (α) in two
complex conjugate points.




Now suppose $A = 0$. Geometrically, this means that the point U
lies on the hypersurface (α) (see the first of the equations (3)). To
the point U corresponds the solution set $(\mu, 0)$ of (2). From the
condition $AC - B^2 = -B^2 \neq 0$ it follows that $B \neq 0$ and so
besides U there is another point of intersection whose coordinates
are determined from (1) provided that $2B\mu + C\nu = 0$.

(2) $AC - B^2 = 0$ but at least one of the coefficients A, B, C
is nonzero. Arguing as before, we can easily verify that (2) has
two solution sets that have merged into a single solution set either
of the form $\nu = \lambda\mu$, $\mu \neq 0$, or of the form $\mu = \lambda\nu$, $\nu \neq 0$. It yields
a unique point belonging to the line (1) and to the hypersurface (α).

In this case however we say that there is a double point of
intersection. If such a point is not a singular point of the
hypersurface (as, for instance, the vertex of a cone), then line (1) is
tangent to the hypersurface at this point.

(3) $A = B = C = 0$. Equation (2) becomes an identity. This
means that line (1) lies entirely in the hypersurface (α), that is
to say, it is its rectilinear generator.

Remark. In contrast to affine space, in projective space any
quadric hypersurface intersects with any straight line and the
concept of asymptotic direction is meaningless.
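
The three cases can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names are hypothetical, the quadric is passed as a symmetric matrix of integers or fractions (so that the comparisons with zero are exact), and U, V are assumed to be distinct points.

```python
def bilinear(a, u, v):
    """Value of the form sum_{i,j} a[i][j] * u[i] * v[j]."""
    n = len(a)
    return sum(a[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


def classify_intersection(a, u, v):
    """Classify how the line through U and V meets the quadric with matrix a.

    Follows cases (1), (2), (3): the coefficients A, B, C of equation (2)
    are computed by formulas (3) and then compared.
    """
    A = bilinear(a, u, u)
    B = bilinear(a, u, v)
    C = bilinear(a, v, v)
    if A == 0 and B == 0 and C == 0:
        return "generator"        # case (3): the line lies in the quadric
    if A * C - B * B == 0:
        return "double point"     # case (2): the two points have merged
    return "two points"           # case (1): distinct or complex conjugate


# Unit circle xi_1^2 + xi_2^2 - xi_3^2 = 0 in the projective plane.
circle = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
print(classify_intersection(circle, [0, 0, 1], [1, 0, 0]))  # the x-axis
print(classify_intersection(circle, [1, 0, 1], [0, 1, 0]))  # the line x = 1
```

The first line is the x-axis, which cuts the circle in two points; the second is the line x = 1, which touches it, so the two calls print `two points` and `double point`.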

2. Let us replace the quadratic form in the left member of the
equation (α) by the polar bilinear form

$$\sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j = 0 \qquad (5)$$

regarding $u_1, \dots, u_{n+1}$ as the coordinates of an arbitrarily chosen
fixed point $U \in P_n$, $\xi_1, \dots, \xi_{n+1}$ being the running coordinates.

Equation (5) determines a hyperplane, with the exception of the
special case where the coefficients of all the $\xi_j$ vanish, that is,
when

$$\left.\begin{aligned}
a_{11}u_1 + \dots + a_{n+1,1}u_{n+1} &= 0 \\
\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{1,n+1}u_1 + \dots + a_{n+1,n+1}u_{n+1} &= 0
\end{aligned}\right\} \qquad (6)$$

But the homogeneous coordinates $u_1, \dots, u_{n+1}$ do not vanish
simultaneously and so from (6) we get

$$\det \|a_{ij}\| = 0 \qquad (7)$$

Then, multiplying (6) by $u_1, \dots, u_{n+1}$ respectively and adding, we
obtain

$$\sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0 \qquad (8)$$


The relations (7) and (8) show that (5) may become an identity
only when the hypersurface (α) is degenerate and the point U
belongs to (α).

What is more, not only does equation (8) hold true with respect
to the point U, but every one of equations (6) holds as well. Such
a point U is called a singular point of the hypersurface (α). Only
degenerate hypersurfaces have singular points. A typical instance
of a singular point is the vertex of a cone. To summarize, then, (5)
becomes an identity when the hypersurface (α) is degenerate, and
point U belongs to it and is its singular point. In all other cases
(5) determines some hyperplane.

Definition. The hyperplane (5) is called the polar of point U with
respect to the hypersurface (α).

From the definition it follows directly that if the point U is
located on the hypersurface (α) and has a polar, then this polar
passes through U; but if U does not belong to (α), then the polar
of the point U does not pass through this point.

If the hyperplane P is the polar of point U, then U is called
the pole of hyperplane P (with respect to the hypersurface (α)
under consideration). It is easy to demonstrate that an arbitrary
hyperplane P has a unique pole with respect to any nondegenerate
quadric hypersurface.

Indeed, suppose we have the hyperplane

$$A_1\xi_1 + \dots + A_{n+1}\xi_{n+1} = 0$$

To find the pole of this hyperplane we obtain from (5) the system

$$\left.\begin{aligned}
a_{11}u_1 + \dots + a_{n+1,1}u_{n+1} &= A_1 \\
\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{1,n+1}u_1 + \dots + a_{n+1,n+1}u_{n+1} &= A_{n+1}
\end{aligned}\right\} \qquad (9)$$

Since the given quadric hypersurface is nondegenerate, we have
$\det \|a_{ij}\| \neq 0$. Under this condition, the system of equations (9)
has a unique solution $(u_1, \dots, u_{n+1})$, which is what we sought.
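
The passage from a point to its polar by (5), and back from a hyperplane to its pole by (9), can be tried out numerically. The sketch below is ours, not the authors': the names `polar` and `pole` are hypothetical, and system (9) is solved by plain Gauss-Jordan elimination over exact fractions, assuming a symmetric matrix with nonzero determinant.

```python
from fractions import Fraction


def polar(a, u):
    """Coefficients A_j = sum_i a[i][j] * u[i] of the polar hyperplane (5)."""
    n = len(a)
    return [sum(a[i][j] * u[i] for i in range(n)) for j in range(n)]


def pole(a, coeffs):
    """Solve system (9) for the pole of the hyperplane with given coefficients.

    Assumes det(a) != 0 (nondegenerate hypersurface); raises ValueError
    if no pivot can be found.
    """
    n = len(a)
    # Row j of the augmented matrix encodes sum_i a[i][j] * u[i] = coeffs[j].
    m = [[Fraction(a[i][j]) for i in range(n)] + [Fraction(coeffs[j])]
         for j in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("degenerate hypersurface: det(a) = 0")
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


# Unit circle and the point U(2, 0, 1), i.e. the affine point (2, 0).
circle = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
plane = polar(circle, [2, 0, 1])   # coefficients of 2*xi_1 - xi_3 = 0
print(plane, pole(circle, plane))
```

For the unit circle the polar of the affine point (2, 0) is the line x = 1/2, and solving (9) for that line returns the point U again.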

3. The foregoing definition of a polar as a hyperplane which is
given for the point $U(u_1, \dots, u_{n+1})$ by equation (5) is connected
with a certain system of projective coordinates. We have to
demonstrate that this definition has an invariant (geometric) meaning,
that is, that the polar of the given point U with respect to the
given quadric hypersurface does not depend on the choice of the
system of projective coordinates.

Suppose we are passing from the old coordinates $\xi_i$ to the new
coordinates $\xi'_i$ via formulas of type (1), Subsection 4, Section 2.
Without loss of generality, we can assume in these formulas $\lambda = 1$.
In the terminology of tensor algebra, formulas of this type define
a contravariant transformation law. In particular, the old
coordinates $u_i$ of point U transform by these formulas to the new
coordinates $u'_i$ of the same point. Transforming the left member of the
equation of the given quadric hypersurface, we get the identity

$$\sum_{i,j=1}^{n+1} a_{ij}\xi_i\xi_j = \sum_{i,j=1}^{n+1} a'_{ij}\xi'_i\xi'_j$$

as a consequence of which the new coefficients $a'_{ij}$ are expressed
in terms of the old coefficients $a_{ij}$ by the covariant law. But then

$$\sum_{i,j=1}^{n+1} a'_{ij}u'_i\xi'_j = \sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j \qquad (10)$$

since the complete contraction of a second-order covariant tensor
with two contravariant tensors is an invariant. From (10) it is
evident that the equations

$$\sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j = 0 \quad \text{and} \quad \sum_{i,j=1}^{n+1} a'_{ij}u'_i\xi'_j = 0$$

hold simultaneously and, consequently, define one and the same
hyperplane. This proves the invariance of the definition of the polar,
that is, the independence of the polar of any choice of projective
coordinates. Now note that formulas of type (1), Subsection 4 of
Section 2, may be regarded from another viewpoint, namely as
formulas of a projective transformation. Therefore we have also
proved the following theorem.

Theorem 1 (projective invariance of a polar). If under a projective
transformation the hypersurface (α) goes into a hypersurface
(α'), and point U goes into point U', then the polar of U with
respect to (α) goes into the polar of U' with respect to (α').

4. In Section 5 we defined harmonic sets of four points. For what
follows we will have to extend this notion to the case where the
points of one of the two harmonically divided pairs are coincident.

For the time being we assume that a projective line with a
quadruple of points M, N, U, V is obtained by supplementing an affine
line with the ideal point U. Then (MNUV) = -1 if V is the midpoint
of the line segment MN (see Section 5, Subsection 5). Now
let point N tend to M and point V remain the fourth harmonic
relative to the ordered triad M, N, U. Then V tends to M.

On this basis we will generally assume that if M = N, then the
fourth harmonic point V relative to the ordered triad M, N, U
coincides with M and N, and we will then write

(MNUV) = (UVMN) = -1

5. Assume U does not belong to the hypersurface (α) and also
that the straight line a passes through U.

By Subsection 1, line a intersects the hypersurface in two points
M, N (which are distinct, coincident or complex conjugate).

Definition. Points U and V on the straight line a are situated
harmonically relative to the hypersurface (α) if (UVMN) = -1.

Remark. This definition can also be used when the space is real
and the points M, N are complex conjugate points (here we use
the terminology of Subsection 1, Section 7). Using the formulas of
Subsection 15, Section 2, we can prove that if U is real in this
case, then V is also real.

6. Theorem 2. If point U does not belong to the hypersurface
(α), then the polar of U is the locus of all points V such that the
pairs UV are situated harmonically relative to (α).

Proof. Let V be the fourth harmonic point for the points M, N,
U. The coordinates of an arbitrary point on line a (this point is
distinct from U) can be represented as

$$\xi_i = \lambda u_i + v_i \qquad (11)$$

(see (1) for $\nu \neq 0$, $\lambda = \mu/\nu$). The points M and N are determined
from (11) for $\lambda = \lambda_1$, $\lambda = \lambda_2$, where $\lambda_1$, $\lambda_2$ are the roots of the
quadratic equation

$$\sum_{i,j=1}^{n+1} a_{ij}v_iv_j + 2\lambda \sum_{i,j=1}^{n+1} a_{ij}u_iv_j + \lambda^2 \sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0 \qquad (12)$$

which is obtained by substituting (11) into (α).

Suppose that $\lambda_1 \neq \lambda_2$. Then points M and N are distinct and by
virtue of the choice of point V we have

$$(MNUV) = (UVMN) = \frac{\lambda_1}{\lambda_2} = -1 \qquad (13)$$

Hence $\lambda_1 + \lambda_2 = 0$ and by Vieta's theorem (sum and product
formulas of the roots of an equation)

$$\sum_{i,j=1}^{n+1} a_{ij}u_iv_j = 0 \qquad (14)$$

This equation shows that V belongs to the polar of the point U.

If $\lambda_1 = \lambda_2$, then M = N and, by Subsection 4, V = M = N;
from (11) we find that $\lambda_1 = \lambda_2 = 0$ (these equations may also be
obtained from (12) if we take into account that V lies on the
hypersurface (α)). Thus we again have

$$\lambda_1 + \lambda_2 = 0 \qquad (15)$$

whence, as above, we get (14).

Now let V belong to the polar, that is, (14) holds true. From
(12), (14) and Vieta's theorem follows (15). Now there are two
possibilities:

either $\lambda_1 = \lambda_2 = 0$ and then V = M = N due to (11), and
(UVMN) = -1 by Subsection 4;

or $\lambda_1 \neq \lambda_2$ and (13) is applicable.

The proof of Theorem 2 is complete.
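
Theorem 2 is easy to check numerically. The following sketch is an illustration under our own assumptions (hypothetical function names, a symmetric matrix, U not on the hypersurface): it computes the roots of equation (12) and the ratio $\lambda_1/\lambda_2$ from (13). Complex arithmetic is used so that the case of complex conjugate points M, N is covered as well.

```python
import cmath


def form(a, u, v):
    """Value of sum_{i,j} a[i][j] * u[i] * v[j]."""
    n = len(a)
    return sum(a[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


def intersection_parameters(a, u, v):
    """Roots lambda_1, lambda_2 of equation (12) for the points lambda*U + V.

    Assumes U is not on the hypersurface, so the coefficient of lambda^2
    is nonzero.
    """
    A = form(a, u, u)   # coefficient of lambda^2
    B = form(a, u, v)   # half the coefficient of lambda
    C = form(a, v, v)   # constant term
    root = cmath.sqrt(B * B - A * C)
    return (-B + root) / A, (-B - root) / A


def cross_ratio_uvmn(a, u, v):
    """The ratio lambda_1 / lambda_2, which equals (UVMN) by formula (13)."""
    l1, l2 = intersection_parameters(a, u, v)
    return l1 / l2


circle = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
U = [2, 0, 1]
print(cross_ratio_uvmn(circle, U, [1, 0, 2]))   # V on the polar of U, M and N real
print(cross_ratio_uvmn(circle, U, [1, 2, 2]))   # V on the polar of U, M and N complex
print(cross_ratio_uvmn(circle, U, [0, 0, 1]))   # V not on the polar of U
```

For the two points V taken on the polar $2\xi_1 - \xi_3 = 0$ of U the ratio is -1; for the centre of the circle, which is not on that polar, it is not.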

7. Theorem 1 is a geometrically obvious consequence of Theorem 2.
To illustrate this fact, consider the case where point U does
not belong to hypersurface (α). Then in order to construct the
polar of U it suffices to find n points $V_1, \dots, V_n$ in general
position and such that all pairs $UV_i$ are located harmonically
relative to (α). The hyperplane passing through the points $V_1, V_2, \dots, V_n$
will be the polar of the point U. Under a projective transformation
it will go into the polar of the image of point U with
respect to the image of the hypersurface (α) because of the
projective invariance of the cross ratio.


Fig. 120

8. Suppose point U does not belong to the hypersurface (α) and
lies on a certain hyperplane that is assumed to be at infinity. Then
from Theorem 2 and Subsection 5, Section 5 (with account taken
of Section 10 of Chapter XI), it follows that the polar P of point U
is a diametral hyperplane conjugate to the direction of the parallel
straight lines that intersect at U (Fig. 120).

9. Let us consider the special case where the hypersurface has
an equation like

$$c_1\xi_1^2 + \dots + c_{n+1}\xi_{n+1}^2 = 0 \qquad (16)$$

where $c_i \neq 0$ for all $i = 1, \dots, n+1$ and for the point U we take
a point $A_j$ with coordinates

$$\xi_j = 1, \quad \xi_i = 0 \ \text{for} \ i \neq j \qquad (17)$$

where j is a fixed number, $1 \le j \le n+1$.

By formula (5) the hyperplane $\xi_j = 0$ is the polar of point $A_j$.

Recall that the choice of points with coordinates like (17) and
of the unit point (1, ..., 1) uniquely determines a system of
coordinates in $P_n$ (Section 3, Subsection 9). In the given case, the
set of points $A_1, \dots, A_{n+1}$ has the following property that describes
the specific position of these points relative to the hypersurface
(16).

Each one of the points $A_j$ is a pole of the hyperplane passing
through the remaining points $A_1, \dots, A_{j-1}, A_{j+1}, \dots, A_{n+1}$.

Such a set of points is said to be self-polar.

It can be demonstrated that this property is not only necessary
but also sufficient for the equation of a nondegenerate quadric
hypersurface to assume the form (16), and by an apt choice of the
unit point we can have $|c_i| = 1$ for $i = 1, \dots, n+1$.


Fig. 121    Fig. 122

10.  We  now  consider  some  properties  of  polars. 

Theorem  3  (equality  principle  in  the  theory  of  polars).  If  a  point 
V  is  located  on  the  polar  of  a  point  U,  then  the  polar  of  V  passes 
through  U. 

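Proof. Theorem 3 follows from the definition of a polar and the
symmetric nature of the matrix $\|a_{ij}\|$ of the coefficients of
equation (α): since $a_{ij} = a_{ji}$, the condition $\sum a_{ij}u_iv_j = 0$ that V lies
on the polar of U coincides with the condition $\sum a_{ij}v_iu_j = 0$ that U
lies on the polar of V.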

Theorem 4. If point U lies on the hypersurface (α) and has
polar P, then every straight line in P that passes through U is
tangent to this hypersurface at U and is possibly its rectilinear
generator (Fig. 121).

Proof. If a straight line of type (1) does not have any common
points with the hypersurface (α) other than U, then it is a tangent
by Subsections 1 and 2. It therefore suffices to prove that if on
line (1) there is, other than U, a point V belonging both to the
hypersurface (α) and to the polar (5) of U, then line (1) lies entirely in
the hypersurface. Let the points U and V belong to the
hypersurface (α):

$$\sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0, \quad \sum_{i,j=1}^{n+1} a_{ij}v_iv_j = 0 \qquad (18)$$

and besides let V lie on the polar of U:

$$\sum_{i,j=1}^{n+1} a_{ij}u_iv_j = 0 \qquad (19)$$

Using (1), (18), (19) and (α), we find that

$$\sum_{i,j=1}^{n+1} a_{ij}(\mu u_i + \nu v_i)(\mu u_j + \nu v_j)
= \mu^2 \sum_{i,j=1}^{n+1} a_{ij}u_iu_j + 2\mu\nu \sum_{i,j=1}^{n+1} a_{ij}u_iv_j + \nu^2 \sum_{i,j=1}^{n+1} a_{ij}v_iv_j = 0$$

for arbitrary $\mu$, $\nu$, that is, the straight line UV is a rectilinear
generator of the hypersurface (α). The proof of Theorem 4 is
complete.

Corollary.  If  a  point  on  a  projective  plane  belongs  to  an  oval 
curve,  then  the  polar  of  this  point  is  tangent  to  the  oval  curve. 

11. Let us consider the following problem as an application of the
results just obtained.

Given  on  a  two-dimensional  Euclidean  plane  an  ellipse  and  a 
point  U  exterior  to  the  ellipse.  It  is  required  to  construct  tangents 
to  the  ellipse  that  will  pass  through  U. 

Construction. Through U draw any two straight lines, each of
which intersects the ellipse in two distinct (real) points A, B and
C, D, respectively (Fig. 122). Let Q be the point of intersection of
the lines AD and BC, and let P be the point of intersection of AC
and BD; K and L are the points of intersection of the ellipse and the
straight line PQ. Then UK and UL are the desired tangents.

The proof is readily carried out with the aid of Theorem 4 of
Subsection 9, Section 5, Theorems 2 and 3 of this section, and the
corollary stated in Subsection 10.
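
The result of the construction can be checked numerically. The sketch below does not reproduce the ruler construction itself. Instead it assumes that the line PQ is the polar of U, as the proof indicates, and works with an ellipse in the canonical form $x^2/a^2 + y^2/b^2 = 1$. The function names and the tolerance are our own choices.

```python
import math


def tangent_points(a, b, x0, y0):
    """Points K, L where the polar of U = (x0, y0) meets x^2/a^2 + y^2/b^2 = 1.

    Assumes U is exterior to the ellipse, so the polar cuts it in two
    real points.
    """
    px, py = x0 / a**2, y0 / b**2          # polar line: px*x + py*y = 1
    norm2 = px * px + py * py
    qx, qy = px / norm2, py / norm2        # a point on the polar line
    dx, dy = -py, px                       # direction vector of the polar line
    # Substitute (qx + t*dx, qy + t*dy) into the equation of the ellipse.
    ca = dx * dx / a**2 + dy * dy / b**2
    cb = 2 * (qx * dx / a**2 + qy * dy / b**2)
    cc = qx * qx / a**2 + qy * qy / b**2 - 1
    disc = cb * cb - 4 * ca * cc
    if disc <= 0:
        raise ValueError("U is not exterior to the ellipse")
    roots = [(-cb + s * math.sqrt(disc)) / (2 * ca) for s in (1, -1)]
    return [(qx + t * dx, qy + t * dy) for t in roots]


def is_tangent(a, b, u, k, tol=1e-9):
    """Check case (2) of Subsection 1 for the line UK: AC - B^2 = 0."""
    m = (1 / a**2, 1 / b**2, -1.0)         # diagonal matrix of the ellipse
    U, K = (u[0], u[1], 1.0), (k[0], k[1], 1.0)
    A = sum(c * p * p for c, p in zip(m, U))
    B = sum(c * p * q for c, p, q in zip(m, U, K))
    C = sum(c * q * q for c, q in zip(m, K))
    return abs(A * C - B * B) < tol


for K in tangent_points(2, 1, 4, 0):
    print(K, is_tangent(2, 1, (4, 0), K))
```

For the ellipse with semiaxes 2 and 1 and the point U(4, 0), the polar is the line x = 1, and both lines UK and UL pass the double-point test of Subsection 1.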


Appendix  1 


PROOF  OF  THE  THEOREM 
ON  THE  CLASSIFICATION 
OF  LINEAR  QUANTITIES 


1. The assertion expressed in Chapter VI at the end of Section 2 and in
Subsection 8 of Section 3 and left without proof can be stated in the form of
a theorem which follows.

THEOREM. Let $G_n$ be the group of all real nonsingular $n \times n$ matrices and
f(P) a real numerical function specified on $G_n$. Let u = f(P) be a
homomorphism of the group $G_n$ into the group (under multiplication) of all real
numbers without zero, that is

$$f(PP') = f(P)\,f(P') \qquad (1)$$

for arbitrary $P, P' \in G_n$. Then

$$\text{either} \quad f(P) = |\det P|^{\sigma}, \quad \sigma \ \text{constant}, \qquad (2)$$

$$\text{or} \quad f(P) = \pm|\det P|^{\sigma} \qquad (3)$$

where the plus sign corresponds to the case det P > 0 and the minus sign to the
case det P < 0.

REMARK 1. Both functions (2) and (3) satisfy the condition (1) due to
the familiar property of the determinant of a product of matrices. And so the
essence of the theorem lies in the guarantee that there are no functions
satisfying condition (1) except (2) and (3).
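
Remark 1 can be illustrated with a quick numerical check. The sketch below is ours: it is restricted to 2 x 2 matrices, the function names are hypothetical, and the exponent 1.5 is an arbitrary sample value of the constant.

```python
import math


def det2(p):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]


def matmul2(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def f_abs(p, sigma):
    """Function (2): |det P|^sigma."""
    return abs(det2(p)) ** sigma


def f_signed(p, sigma):
    """Function (3): +|det P|^sigma if det P > 0, -|det P|^sigma if det P < 0."""
    return math.copysign(1.0, det2(p)) * abs(det2(p)) ** sigma


P = [[1, 2], [3, 4]]     # det P = -2
Q = [[0, 1], [2, 5]]     # det Q = -2
PQ = matmul2(P, Q)       # det(PQ) = 4
for f in (f_abs, f_signed):
    print(math.isclose(f(PQ, 1.5), f(P, 1.5) * f(Q, 1.5)))
```

Both lines print `True`: condition (1) holds for (2) and for (3), including the sign, since the product of two matrices with negative determinants has a positive determinant.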

REMARK 2. In the statement of the theorem given above we have dropped
the function-theoretic conditions. We will prove the theorem assuming that
the function f(P) is continuous on $G_n$.

REMARK 3. From now on we can forget that u = f(P) is a homomorphism
of $G_n$ into the group (under multiplication) of the real numbers without zero
and only require the observance of (1). The point is that if we exclude the
uninteresting case of the identity $f(P) \equiv 0$, then it follows of itself from (1)
that $f(P) \neq 0$ for all $P \in G_n$.

Indeed, let us assume that there is at least one matrix $P_0 \in G_n$ for which
$f(P_0) = 0$. Then for any $P \in G_n$ we have $f(P) = f(PP_0^{-1})\,f(P_0) = 0$.

From this we obtain an important corollary to condition (1):

$$f(E) = 1 \qquad (4)$$

where E is the unit matrix. Equation (4) follows from the relation
$f(P) = f(PE) = f(P)\,f(E)$ since $f(P) \neq 0$.

From (4) we get

$$f(P^{-1}) = \{f(P)\}^{-1} \qquad (5)$$

since $f(P^{-1})\,f(P) = f(E) = 1$.




2. PROOF. First for n = 1, for which we have

$$f(xy) = f(x)\,f(y) \qquad (6)$$

where $x, y \in R$. R stands for the real line with zero deleted.

Note several simple corollaries of (6).

(1) If x > 0, then f(x) > 0. True enough, for

$$f(x) = f(\sqrt{x})\,f(\sqrt{x}) = \{f(\sqrt{x})\}^2 > 0$$

(2) If there is a number $x_0 < 0$ for which $f(x_0) < 0$, then f(x) < 0 for any
x < 0 ($x \in R$). Indeed, if x < 0, then $f(x)\,f(x_0) = f(xx_0) > 0$.

Under this same assumption, we have $f(x) = -f(|x|)$ for x < 0. Indeed,
due to (5) we find $\{f(|x|)\}^{-1} = f(|x|^{-1})$, whence $f(x) : f(|x|) = f(-1) = -1$,
since $f(-1)\,f(-1) = f(1) = 1$ and $f(-1) < 0$.

(3) If there is a number $x_0 < 0$ for which $f(x_0) > 0$, then f(x) > 0 for all
$x \in R$, and always $f(x) = f(|x|)$.

Because of properties (1), (2), (3) the matter reduces to considering the
semiaxis x > 0.

(4) For every rational number r > 0 and for arbitrary x > 0,

$$f(x^r) = \{f(x)\}^r \qquad (7)$$

True because if r = n (n natural), then by (6)

$$f(x^n) = f(x x \dots x) = f(x)\,f(x) \dots f(x) = \{f(x)\}^n \qquad (8)$$

If r = 1/m (m natural), then due to (8) $\{f(x^{1/m})\}^m = f(x)$, whence

$$f(x^{1/m}) = \{f(x)\}^{1/m} \qquad (9)$$

From (8) and (9) we get

$$f(x^{n/m}) = \{f(x)\}^{n/m}$$

which proves (7).

Now let us take a fixed number a, a > 0, $a \neq 1$. Put b = f(a). Under our
hypothesis b > 0. We can write $b = a^{\sigma}$, $\sigma$ constant. Let x be any positive
number. We can also write $x = a^k$ and assume that k is the limit of a
sequence of rational numbers $r_n$:

$$k = \lim_{n \to +\infty} r_n$$

By (7)

$$f(a^{r_n}) = \{f(a)\}^{r_n}$$

whence and due to the continuity of f(x) we have

$$f(x) = f(a^k) = f(\lim a^{r_n}) = \lim f(a^{r_n}) = \lim \{f(a)\}^{r_n}
= \{f(a)\}^k = b^k = a^{\sigma k} = (a^k)^{\sigma} = x^{\sigma}$$

Thus the theorem is proved for n = 1. Namely, either $f(x) = |x|^{\sigma}$ for every
$x \in R$ or $f(x) = \pm|x|^{\sigma}$, where the plus and minus correspond to the cases
x > 0 and x < 0 ($x \in R$).

3. We now take up the group $G_n$ for any n. Assume that there is a
numerical function f(P), $P \in G_n$, that satisfies the condition (1).

First of all note that this function assumes the same value on equivalent
matrices. Thus, if

$$P_2 = AP_1A^{-1}$$

then by (1) and (5)

$$f(P_2) = f(A)\,f(P_1)\,f(A^{-1}) = f(P_1)$$




We can therefore assume that the function f(P) at hand is defined on the set
of all nonsingular linear transformations of an n-dimensional real linear space
$L_n$. Here the symbol P is to be understood either as the designation of a
linear transformation or as that of its matrix (relative to any basis).

We introduce a Euclidean metric into the space $L_n$. Then for any P we
have

$$P = JB \qquad (10)$$

where J is an isometric linear transformation and B is a self-adjoint
transformation (see Chapter IX). We can say that det J > 0 (and hence det J = +1).

Consider the subgroup j(t) of the group $G_n$ made up of matrices of type

$$j(t) = \begin{pmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & \cos t & -\sin t & & \\
& & \sin t & \cos t & & \\
& & & & \ddots & \\
& & & & & 1
\end{pmatrix} \qquad (11)$$

(all entries not shown are zero), assuming that the $2 \times 2$ submatrix (k)
with the entries $\cos t$, $-\sin t$, $\sin t$, $\cos t$ occupies a fixed place on the
diagonal. Let us verify that f(j(t)) = 1 for arbitrary t. True enough, for the
function u = f(j(t)) maps the interval $0 \le t \le 2\pi$ onto an interval
$a \le u \le b$, $[a, b] \subset R$, and since f(j(0)) = f(E) = 1, it follows that
$0 < a \le 1 \le b$ (take note of Remark 3 of Subsection 1), whence if there is a
value of t for which $f(j(t)) \neq 1$, then a < b. Besides, since j(t) is a subgroup,
every element has an inverse. Therefore from (5) it follows that a < 1 < b.
Let j be an element of j(t) such that f(j) = b. We have $f(j^2) = b^2 > b$, which
is impossible since $j^2$ belongs to j(t) and b is the maximum value of f(j(t)).
We divide the remainder of the proof into two parts.

(1) By Chapter IX (with account taken of the fact that det J = +1),
matrix J can be represented as $J = j_1 \dots j_s$, where $j_1, \dots, j_s$ are matrices of
type (11) with different positions of the submatrix (k). From this and on the
basis of the foregoing,

$$f(J) = f(j_1) \dots f(j_s) = 1 \qquad (12)$$

(2) Set

$$b_k(x) = \begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & x & & \\
& & & \ddots & \\
& & & & 1
\end{pmatrix} \qquad (13)$$

where x is in the kth place on the diagonal (all entries not shown are zero).
For a given x and for distinct k, expression (13) yields equivalent matrices.
Hence the function

$$\varphi(x) = f(b_k(x))$$

does not depend on k. Also note that $b_k(x)\,b_k(y) = b_k(xy)$, whence and on
the basis of (1) we find $\varphi(xy) = \varphi(x)\,\varphi(y)$, where $x, y \in R$. Therefore and on
the basis of Subsection 2 we have: either $\varphi(x) = |x|^{\sigma}$, $\sigma$ constant, or
$\varphi(x) = \pm|x|^{\sigma}$, where the plus and minus signs correspond to the cases x > 0 and
x < 0, respectively.

By Chapter IX, the linear transformation B relative to some basis is
represented by a diagonal matrix; let $\lambda_1, \dots, \lambda_n$ be the numbers on the diagonal of this
matrix.

We then have $B = b_1(\lambda_1) \dots b_n(\lambda_n)$, whence either

$$f(B) = \varphi(\lambda_1) \dots \varphi(\lambda_n) = |\lambda_1 \dots \lambda_n|^{\sigma} \qquad (14)$$

or

$$f(B) = \varphi(\lambda_1) \dots \varphi(\lambda_n) = \pm|\lambda_1 \dots \lambda_n|^{\sigma} \qquad (15)$$

We have (15) if $\varphi(x) = \pm|x|^{\sigma}$. Then in (15) the minus sign holds if
among the numbers $\lambda_1, \dots, \lambda_n$ there is an odd number of negative numbers,
that is, if $\lambda_1 \dots \lambda_n < 0$. But $\lambda_1 \dots \lambda_n = \det B = \det P$. From this and from
(10), (12), (14), (15) we get the assertion of the theorem.


Appendix  2 



HERMITIAN  FORMS.  UNITARY  SPACE 


1. Let L be a complex linear space. Besides the linear functions that have
been studied (Section 1, Chapter IV), in L we can consider so-called linear
functions of the second kind defined by the following axioms:

(1) $b(x + y) = b(x) + b(y)$ for any vectors x, y in L;

(2) $b(\alpha x) = \bar{\alpha}\, b(x)$ for any vector x in L and any complex number $\alpha$.

In contrast to these functions, the ordinary linear functions are called
linear functions of the first kind.

There is no necessity to construct a separate theory of linear functions of
the second kind, for if a(x) is a linear function of the first kind, then
$b(x) = \overline{a(x)}$ is a linear function of the second kind; if b(x) is a linear function
of the second kind, then $a(x) = \overline{b(x)}$ is a linear function of the first kind.
However, the existence of two types of linear functions (forms) implies the
existence of distinct types of multilinear forms, namely, such as are linear of the
first kind in a certain set of their arguments and linear of the second kind in
the remaining arguments. In particular we have four types of bilinear forms:

(1)  linear  of  the  first  kind  in  each  of  the  arguments  (they  were  investigated 
in  Chapter  IV); 

(2)  linear  of  the  first  kind  in  the  first  argument  and  of  the  second  kind  in 
the  second  argument; 

(3)  linear  of  the  second  kind  in  the  first  argument  and  of  the  first  kind  in 
the  second  argument; 

(4)  linear  of  the  second  kind  in  both  arguments. 

It  is  readily  seen  that  the  fourth  type  is  obtained  from  the  first  by  complex 
conjugation,  and  the  third  is  obtained  from  the  second. 

Our subject here will be a certain class of bilinear forms of the second type
that have important applications (in particular in the theory of functions of
complex variables and in quantum physics) and also related questions of
geometry.

DEFINITION. A bilinear form of the second type a(x, y) is said to be
Hermitian if $a(y, x) = \overline{a(x, y)}$ for any vectors x, y in L (the bar over the complex
number indicates, as usual, conjugation).

Thus, the function a(x, y) is called a bilinear Hermitian form if

(1) $a(x_1 + x_2, y) = a(x_1, y) + a(x_2, y)$
for arbitrary vectors $x_1$, $x_2$, y in L;

(2) $a(\alpha x, y) = \alpha\, a(x, y)$ for arbitrary vectors x, y in L and for any
complex number $\alpha$;

(3) $a(y, x) = \overline{a(x, y)}$ for arbitrary vectors x, y in L.




EXAMPLE. Let L be a linear space of continuous complex-valued functions
specified on the interval $t_1 \le t \le t_2$ of the real axis. Set

$$a(x, y) = \int_{t_1}^{t_2} x(t)\,\overline{y(t)}\,dt$$

Then the function a(x, y) is a bilinear Hermitian form.


2. Let a(x, y) be a bilinear Hermitian form. Then the function
f(x) = a(x, x) is a quadratic Hermitian form, or simply a Hermitian form.

The original bilinear Hermitian form a(x, y) is said to be the polar of the
Hermitian form f(x) = a(x, x).

From the definition it follows directly that the (quadratic) Hermitian form
assumes only real values: $\overline{f(x)} = f(x)$.

A Hermitian form f(x) is said to be nonnegative (nonpositive) if $f(x) \ge 0$
($f(x) \le 0$) for any x in L, and positive definite (negative definite) if f(x) > 0
(f(x) < 0) for any $x \neq 0$.

THEOREM 1. A Hermitian form f(x) uniquely defines its polar bilinear
form a(x, y).

PROOF. We have to express an unknown bilinear Hermitian form $a(x, y)$ in
terms of a given function $f(x)$ using the fact that $a(x, x) = f(x)$. We have

$$f(x + y) = a(x + y, x + y) = a(x, x) + a(y, y) + a(x, y) + a(y, x) = f(x) + f(y) + 2\operatorname{Re} a(x, y)$$

whence

$$\operatorname{Re} a(x, y) = \tfrac{1}{2}\,[f(x + y) - f(x) - f(y)] \tag{1}$$

Furthermore

$$f(x + iy) = a(x + iy, x + iy) = a(x, x) + a(iy, iy) + a(x, iy) + a(iy, x)$$

$$= a(x, x) + a(y, y) - i\,a(x, y) + i\,\overline{a(x, y)} = f(x) + f(y) + 2\operatorname{Im} a(x, y)$$

whence

$$\operatorname{Im} a(x, y) = \tfrac{1}{2}\,[f(x + iy) - f(x) - f(y)] \tag{2}$$

From (1) and (2) we get

$$a(x, y) = \tfrac{1}{2}\,\{f(x + y) + i f(x + iy) - (1 + i)[f(x) + f(y)]\} \tag{3}$$

which proves the theorem.

REMARK. Formula (3) may be given a more symmetric notation by replacing
$y$ by $(-y)$ and subtracting the resulting equation from (3) termwise.
We then have

$$a(x, y) = \tfrac{1}{4}\,\{f(x + y) - f(x - y) + i\,[f(x + iy) - f(x - iy)]\} \tag{4}$$


HERMITIAN  FORMS.  UNITARY  SPACE 




Note the difference between the formulas (3), (4) and the corresponding
formula (1) of Section 4, Chapter IV, and the resemblance between the latter
formula and formula (1) of this section.
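The polarization formula (4) can be checked numerically. The following is a minimal sketch in plain Python, assuming a small Hermitian matrix and two test vectors chosen purely for illustration (the names `A`, `a`, `f`, `add` and `polar` are not from the text). It recovers $a(x, y)$ from values of $f$ alone and compares the result with a direct evaluation.

```python
# Hermitian matrix (a_jk = conj(a_kj)); an arbitrary illustrative choice.
A = [[2, 1 + 3j],
     [1 - 3j, -1]]

def a(x, y):
    """Bilinear Hermitian form a(x, y) = sum_jk a_jk x^j conj(y^k)."""
    return sum(A[j][k] * x[j] * y[k].conjugate()
               for j in range(2) for k in range(2))

def f(x):
    """Quadratic Hermitian form f(x) = a(x, x); its value is real."""
    return a(x, x).real

def add(x, y, c=1):
    """Return the vector x + c*y."""
    return [xj + c * yj for xj, yj in zip(x, y)]

def polar(x, y):
    """Recover a(x, y) from f alone by formula (4)."""
    return 0.25 * (f(add(x, y)) - f(add(x, y, -1))
                   + 1j * (f(add(x, y, 1j)) - f(add(x, y, -1j))))

x = [1 + 2j, -3j]
y = [0.5 - 1j, 2 + 1j]
print(abs(polar(x, y) - a(x, y)) < 1e-12)
```

Note that `polar` calls only `f`, which is the content of Theorem 1: the quadratic form determines its polar.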

3. Now suppose that the complex space $L$ is n-dimensional and that $e_1, \ldots, e_n$
is a basis in it. Expanding the vectors $x$, $y$ with respect to this basis
($x = \sum x^j e_j$, $y = \sum y^k e_k$) and using the definition of the bilinear Hermitian
form $a(x, y)$, we get the coordinate representation:

$$a(x, y) = \sum a_{jk}\, x^j\, \overline{y^k} \tag{5}$$

where

$$a_{jk} = a(e_j, e_k) = \overline{a_{kj}} \tag{6}$$

At the same time,

$$f(x) = a(x, x) = \sum a_{jk}\, x^j\, \overline{x^k} \tag{7}$$

The matrix $A = \|a_{jk}\|$ is called the matrix of the Hermitian form (7) and
its polar bilinear form (5) in the given basis. In matrix notation, formula (5)
looks like this:

$$a(x, y) = x^* A\, \bar{y} \tag{5a}$$

Here, $x$ and $y$ are column matrices ($n \times 1$ matrices). As usual, the star denotes
the operation of transposition of the matrix. The bar on the matrix $y$ means
that all the elements are to be replaced by complex-conjugate numbers.

The rank of a Hermitian form is the rank of its matrix. A Hermitian form is
said to be nonsingular if its rank is equal to $n$. The invariance of rank relative
to choice of basis will be proved in the next subsection.

We note in passing that the complex $n \times n$ matrix $A = \|a_{jk}\|$ is said to be
Hermitian if it satisfies the condition $a_{kj} = \overline{a_{jk}}$ or, in matrix notation,

$$A^* = \bar{A} \tag{8}$$

From (8) it follows that the determinant of any Hermitian matrix is real:

$$\overline{\det A} = \det \bar{A} = \det A^* = \det A \tag{9}$$

From (8) it also follows that a real matrix is Hermitian if and only if it is
symmetric.

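4. Let us determine the law of transformation of coefficients of a Hermitian
form under a transformation of the basis. We pass from the original basis
$e_1, \ldots, e_n$ to the new basis:

$$e_{j'} = \sum_j P^j_{j'}\, e_j \tag{10}$$

From (6) and (10) we find

$$a_{j'k'} = a(e_{j'}, e_{k'}) = a\Bigl(\sum_j P^j_{j'} e_j,\ \sum_k P^k_{k'} e_k\Bigr) = \sum_{j,k} P^j_{j'}\,\overline{P^k_{k'}}\; a(e_j, e_k) = \sum_{j,k} a_{jk}\, P^j_{j'}\,\overline{P^k_{k'}}$$

Thus

$$a_{j'k'} = \sum_{j,k} a_{jk}\, P^j_{j'}\,\overline{P^k_{k'}} \tag{11}$$

In matrix notation (11) becomes

$$A' = P A \bar{P}^* \tag{12}$$

From (12) follows directly the invariance of the rank of a Hermitian form,
since $\det P \ne 0$ and $\det \bar{P}^* = \overline{\det P} \ne 0$.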




5. We now show that the law of transformation of coefficients given by
formula (11) guarantees the invariance of the bilinear Hermitian form.

Let the function $a(x, y)$, relative to a basis $e_1, \ldots, e_n$, be given by
formula (5); when passing to a new basis (10) let its coefficients transform by
formula (11). We now verify that the numerical value of $a(x, y)$ on an
arbitrary pair of vectors $x$, $y$ is preserved. It will be convenient to use the
matrix formulas (5a) and (12) in addition to the coordinate notation. Let

$$a'(x, y) = \sum a_{j'k'}\, x^{j'}\, \overline{y^{k'}} = (x')^* A'\, \overline{y'}$$

By Section 5, Chapter II, $x = P^* x'$, $y = P^* y'$ and so

$$a(x, y) = x^* A \bar{y} = (P^* x')^* A\, \overline{(P^* y')} = (x')^* P A \bar{P}^*\, \overline{y'} = (x')^* A'\, \overline{y'} = a'(x, y)$$

which is what we set out to prove.

Also note that for the function $a(x, y)$, formula (5) ensures linearity of the first kind
in the first argument and linearity of the second kind in the second argument,
and if $a_{kj} = \overline{a_{jk}}$, then $a(y, x) = \overline{a(x, y)}$. Thus, formula (5), with the foregoing
circumstances taken into account, yields the general form of bilinear Hermitian
forms in n-dimensional complex space.
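The invariance just proved can be illustrated on a concrete case. The sketch below is plain Python with illustrative data only: the matrices `A` and `P`, the coordinate lists, and all helper names are assumptions, not taken from the text. It stores `P` so that row $j'$ holds the old coordinates of the new basis vector $e_{j'}$, forms $A' = PA\bar{P}^*$ as in (12), and compares the value of the form computed in the old and in the new coordinates.

```python
def transpose(M):
    """Transposed matrix (the star operation of the text)."""
    return [list(col) for col in zip(*M)]

def conj(M):
    """Entrywise complex conjugate (the bar operation of the text)."""
    return [[complex(v).conjugate() for v in row] for row in M]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def form(M, x, y):
    """a(x, y) = sum_jk m_jk x^j conj(y^k) for coordinate lists x, y."""
    n = len(x)
    return sum(M[j][k] * x[j] * complex(y[k]).conjugate()
               for j in range(n) for k in range(n))

def old_coords(P, new):
    """Old coordinates from new ones: x = P^T x'."""
    n = len(new)
    return [sum(P[i][j] * new[i] for i in range(n)) for j in range(n)]

A = [[2, 1 + 3j], [1 - 3j, -1]]   # Hermitian matrix in the old basis
P = [[1, 1j], [2, 1]]             # nonsingular: det P = 1 - 2i
A_new = matmul(matmul(P, A), conj(transpose(P)))   # formula (12)

x_new, y_new = [1 - 1j, 2j], [3, -1 + 1j]          # coordinates in the new basis
x_old, y_old = old_coords(P, x_new), old_coords(P, y_new)

print(abs(form(A, x_old, y_old) - form(A_new, x_new, y_new)) < 1e-9)
```

The same run can be used to confirm that `A_new` is again a Hermitian matrix, as (11) requires.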

6. Suppose, relative to a certain basis, the matrix of the Hermitian form
$f(x)$ is such that $a_{jk} = 0$ when $j \ne k$. Then we say that the Hermitian form
$f(x)$ in this basis is of canonical form:

$$f(x) = a_{11}\, x^1 \overline{x^1} + a_{22}\, x^2 \overline{x^2} + \ldots + a_{nn}\, x^n \overline{x^n} \tag{13}$$

Note that all the coefficients $a_{jj}$ are real, since, generally, $a_{kj} = \overline{a_{jk}}$ (in any
basis).

Repeating  the  arguments  and  computations  of  Sections  5  to  9,  Chapter  IV, 
almost  word  for  word,  we  establish  the  following. 

(1)  A  Hermitian  form  can  be  brought  to  canonical  form  by  a  nonsingular 
transformation  of  the  variables,  for  instance,  via  the  Lagrange  method. 

REMARK. Bear in mind that here we have to do with the quantities $x^j \overline{x^j}$
instead of the squares $(x^j)^2$; we refer the reader to the example in Subsection
7 below.

(2)  If  all  the  principal  minors  of  the  matrix  of  a  Hermitian  form  are  other 
than  zero,  then  reduction  to  canonical  form  can  be  carried  out  via  the  Jacobi 
method. 

REMARK. The principal minors of a Hermitian matrix are always real because
of (9), since each is itself a determinant of some Hermitian matrix.

(3) For each Hermitian form $f(x)$ the number of positive, negative and zero
coefficients $a_{jj}$ in its canonical form (13) is independent of the choice of the
basis that gives the form its canonical form (the law of inertia of Hermitian
forms).

(4) A Hermitian form is positive definite if and only if all the principal
minors of its matrix are positive (Sylvester's criterion for Hermitian forms).

7. EXAMPLE. Let us reduce the Hermitian form

$$f = (1 + 3i)\, x^1 \overline{x^2} + (1 - 3i)\, x^2 \overline{x^1} \tag{14}$$

to canonical form via the Lagrange method. Since $a_{11} = a_{22} = 0$, we first apply
the transformation

$$x^1 = z^1 + z^2, \qquad x^2 = z^1 - z^2 \tag{15}$$



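in order to obtain at least one nonzero coefficient on the principal diagonal of the
matrix of the form $f$. Substituting (15) into (14) and regrouping, we get

$$f = 2\, z^1 \overline{z^1} - 6i\, z^1 \overline{z^2} + 6i\, z^2 \overline{z^1} - 2\, z^2 \overline{z^2} \tag{16}$$

We are now in a situation that corresponds to the first case of Section 5,
Chapter IV. Set

$$y^1 = 2 z^1 + 6i\, z^2, \qquad y^2 = z^2 \tag{17}$$

Then the Hermitian form $g = f - \tfrac{1}{2}\, y^1 \overline{y^1}$ does not contain $y^1$ and therefore
depends solely on $y^2$, namely,

$$g = f - \tfrac{1}{2}\, y^1 \overline{y^1} = -20\, z^2 \overline{z^2} = -20\, y^2 \overline{y^2}$$

because of (16) and (17). Hence

$$f = \tfrac{1}{2}\, y^1 \overline{y^1} - 20\, y^2 \overline{y^2} \tag{18}$$

Now, expressing $z^1$ and $z^2$ from (15) and substituting them into (17), we get

$$y^1 = (1 + 3i)\, x^1 + (1 - 3i)\, x^2, \qquad y^2 = \tfrac{1}{2}\,(x^1 - x^2) \tag{19}$$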

Thus, the Hermitian form (14) is reduced to canonical form (18) by a nonsingular
linear transformation of the variables (19). The corresponding transformation
of a basis in two-dimensional complex space can be written out in
accordance with Section 5 of Chapter II.
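The reduction can be verified by direct substitution. The sketch below, in plain Python with arbitrarily chosen sample points (the function names `f`, `canonical` and `max_gap` are illustrative), evaluates the form (14) and the canonical expression (18), where $y^1 = (1 + 3i)x^1 + (1 - 3i)x^2$ and $y^2 = \tfrac{1}{2}(x^1 - x^2)$ follow from (15) and (17).

```python
def f(x1, x2):
    """Form (14): f = (1+3i) x1 conj(x2) + (1-3i) x2 conj(x1)."""
    return ((1 + 3j) * x1 * x2.conjugate()
            + (1 - 3j) * x2 * x1.conjugate())

def canonical(x1, x2):
    """Canonical expression (18) written in the original variables."""
    y1 = (1 + 3j) * x1 + (1 - 3j) * x2   # from (15) and (17)
    y2 = (x1 - x2) / 2                   # y2 = z2 = (x1 - x2)/2
    return 0.5 * abs(y1) ** 2 - 20 * abs(y2) ** 2

# Sample points chosen arbitrarily for the check.
samples = [(1 + 0j, 0j), (0j, 1 + 0j), (2 - 1j, 0.5 + 3j), (-1j, 1 + 1j)]

def max_gap():
    """Largest discrepancy between (14) and (18) over the samples."""
    return max(abs(f(x1, x2) - canonical(x1, x2)) for x1, x2 in samples)

print(max_gap() < 1e-9)
```

The coefficients $\tfrac{1}{2}$ and $-20$ have opposite signs, so the form is indefinite, in agreement with the law of inertia stated in Subsection 6.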

8. Let a Hermitian form $f(x)$ be reduced to the canonical form (13). Then,
putting $\tilde{x}^j = \sqrt{|a_{jj}|}\, x^j$ if $a_{jj} \ne 0$, and $\tilde{x}^j = x^j$ if $a_{jj} = 0$, we reduce $f$ to
normal form via a nonsingular transformation of the variables. Dropping the tilde
we can write it thus:

$$f(x) = \sum \varepsilon_j\, x^j \overline{x^j} = \sum \varepsilon_j\, |x^j|^2 \tag{20}$$

where the $\varepsilon_j$ are equal to $\pm 1$ or zero.

9. Let there be given in a linear space $L$ (complex, but not
necessarily finite-dimensional) a positive definite Hermitian form $g(x)$. Let us
consider the polar bilinear Hermitian form $a(x, y)$ and call its value on an
arbitrary pair of vectors $x$, $y$ their scalar product:

$$(x, y) = a(x, y) \tag{21}$$

Accordingly, we define the norm of a vector $\|x\| = \sqrt{(x, x)} = \sqrt{g(x)}$ and also
introduce the concept of orthogonality by regarding the vectors $x$ and $y$ as
orthogonal if and only if their scalar product is zero: $(x, y) = 0$.

DEFINITION.  A  space  L  with  a  specified  scalar  product  (21)  is  said  to  be 
unitary  and  the  Hermitian  form  g(x)  is  called  the  metric  form  of  that  space. 
We  also  say  that  a  unitary  metric  is  introduced  in  the  space  L. 

The norm of an arbitrary vector $x$ in unitary space is a nonnegative real
number ($\|x\| \ge 0$) which, due to the positive definiteness of the metric form
$g(x)$, is equal to zero if and only if $x$ is a zero vector. Here,
$\|\alpha x\| = \sqrt{(\alpha x, \alpha x)} = \sqrt{\alpha\bar{\alpha}\, g(x)} = |\alpha| \cdot \|x\|$
for any scalar $\alpha$ and any vector $x$. In Subsection 12
below we will see that the triangle inequality also holds.

By  Theorem  1,  the  specification  of  the  metric  form  g(x)  uniquely  defines  the 
scalar  product  of  any  pair  of  vectors.  The  properties  of  a  scalar  product  differ 
somewhat  from  the  real  case.  Indeed,  we  have 

$$(x_1 + x_2, y) = (x_1, y) + (x_2, y);$$

$$(x, y_1 + y_2) = (x, y_1) + (x, y_2);$$

$$(\alpha x, y) = \alpha\,(x, y)$$

but $(x, \alpha y) = \bar{\alpha}\,(x, y)$ and $(y, x) = \overline{(x, y)}$. The numerical values of a scalar
product are, generally, complex.

The  orthogonality  of  a  vector  to  a  subspace,  the  orthogonality  of  subspaces, 
and  the  orthogonal  complement  are  defined  in  unitary  space  by  analogy  with 
Euclidean  space. 


Fig.  123  Fig.  124 

REMARK. Occasionally, the requirement of positive definiteness of the metric
Hermitian form $g(x)$ is dropped. We then have a class of spaces that is
more general than the class of unitary spaces. Such spaces are called spaces
with a Hermitian metric. We shall not discuss them.

10. EXAMPLE. A ONE-DIMENSIONAL UNITARY SPACE. In order to construct
this space we have to specify a positive definite Hermitian form $g(x)$ in
a one-dimensional linear complex space $L_1$. For $L_1$ we take a coordinate space
which may be imagined as the ordinary plane of a complex variable, the
elements $x$, $y$, ... being complex numbers. The matrix of the desired Hermitian
form $g$ contains the sole element $a_{11}$, which must be a real number ($a_{11} = \overline{a_{11}}$).
By Sylvester's criterion, $a_{11} > 0$. Suppose we choose $a_{11} = \alpha^2$, $\alpha > 0$. We take
the positive definite Hermitian form $g(x) = \alpha^2 x \bar{x}$ for the metric form of $L_1$.
Then $\|x\| = \alpha |x|$, so that the elements of $L_1$ with the given norm $\|x\|$ fill the
circle $|x| = \tfrac{1}{\alpha}\, \|x\|$ on the complex plane (Fig. 123). In the resulting unitary
space, a scalar product is given by

$$(x, y) = \alpha^2 x \bar{y} \tag{22}$$

To determine its geometric meaning, we put $x = u + iv$, $y = \xi + i\eta$. Then (22)
becomes

$$(x, y) = \alpha^2 \left\{ (u\xi + v\eta) - i \begin{vmatrix} u & v \\ \xi & \eta \end{vmatrix} \right\} = \alpha^2 (\tau - i\sigma) \tag{23}$$

Here, $\tau$ denotes the scalar product of the radius vectors $ox$ and $oy$ viewed as
vectors of a Euclidean plane, and $\sigma$ is the oriented area of a parallelogram
constructed on these vectors (Fig. 124). From (23) it is evident that $(x, y) = 0$
if and only if $\tau = \sigma = 0$, that is, when $x = 0$ or $y = 0$. It is therefore quite
evident that there cannot be two nonzero orthogonal vectors in a one-dimensional
unitary space. At the same time, this space can be naturally depicted as
a two-dimensional Euclidean plane, the norms of the elements being proportional
or, with an appropriate choice of scale (for $\alpha = 1$), equal to the lengths
of the corresponding vectors of the Euclidean plane.
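The geometric reading of the one-dimensional scalar product is easy to confirm on numbers. The sketch below uses Python's built-in complex type; the sample values of $x$ and $y$ and the helper names `scalar_product` and `tau_sigma` are illustrative assumptions. It computes $\alpha^2 x\bar{y}$ directly and compares it with $\alpha^2(\tau - i\sigma)$, where $\tau$ is the Euclidean dot product and $\sigma$ the oriented area.

```python
def scalar_product(x, y, alpha=1.0):
    """Scalar product of the one-dimensional unitary space: alpha^2 * x * conj(y)."""
    return alpha ** 2 * x * y.conjugate()

def tau_sigma(x, y):
    """Euclidean dot product tau and oriented area sigma of the radius vectors."""
    u, v = x.real, x.imag
    xi, eta = y.real, y.imag
    tau = u * xi + v * eta      # dot product of (u, v) and (xi, eta)
    sigma = u * eta - v * xi    # determinant | u v ; xi eta |
    return tau, sigma

x, y = 3 + 1j, 1 + 2j
tau, sigma = tau_sigma(x, y)
print(scalar_product(x, y), complex(tau, -sigma))
```

For these sample values both expressions give $5 - 5i$: the dot product is 5 and the oriented area is 5.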

11. We now prove that in every finite-dimensional unitary space there is an
orthonormal basis, that is, a basis such that all its vectors are pairwise
orthogonal and have unit norms. Such, in actuality, is any basis in which the
metric form $g(x)$ is of normal form:

$$g(x) = \|x\|^2 = x^1 \overline{x^1} + \ldots + x^n \overline{x^n} = \sum |x^j|^2 \tag{24}$$

(see (20) for $\varepsilon_j = +1$, $j = 1, \ldots, n$).

As in the case of a Euclidean space, an orthonormal basis is not defined
uniquely in a unitary space. Take the example of the preceding subsection; for the
(sole) vector constituting the orthonormal basis we can take (when $\alpha = 1$)
any complex number with unit modulus ($e_1 = \cos\varphi + i \sin\varphi$).

Because of (24), the scalar product in any orthonormal basis of a unitary
space is given by

$$(x, y) = x^1 \overline{y^1} + \ldots + x^n \overline{y^n} \tag{25}$$

Using orthonormal bases, it is easy to verify (as was done in Section 5 of
Chapter VIII) that unitary spaces of the same dimension are metrically
isomorphic.

12. To get a better picture of the geometry of a unitary space in the
n-dimensional case, we compare an n-dimensional complex space $C_n$ with a real
space $E_{2n}$ of dimension double that of $C_n$ (see Section 11 of Chapter I). Let $C_n$ be a
unitary space and let $e_1, \ldots, e_n$ be an orthonormal basis in it. Decompose an
arbitrary vector $x$ of $C_n$ in terms of this basis and separate the real and
imaginary parts of each component (coordinate) $x^k$: $x^k = u^k + i v^k$. We can regard
$C_n$ as a coordinate space and write down its elements as

$$x = \{u^1 + i v^1,\ u^2 + i v^2,\ \ldots,\ u^n + i v^n\}$$

Besides the basis $e_1, \ldots, e_n$, consider the vectors $l_k = i e_k$, $k = 1, \ldots, n$.
Then

$$x = u^1 e_1 + \ldots + u^n e_n + v^1 l_1 + \ldots + v^n l_n$$

Now write down the vector $x$ as

$$x = \{u^1, v^1, u^2, v^2, \ldots, u^n, v^n\}$$

viewing it as an element of a real coordinate space $E_{2n}$. Such precisely was
the comparison dealt with in Section 11 of Chapter I (with somewhat different
notation). By Section 11, Chapter I, the spaces $C_n$ and $E_{2n}$ are isomorphic
from the standpoint of the operations of addition and multiplication by real
factors. We now assume that $E_{2n}$ is Euclidean and that the basis
$e_1, l_1, e_2, l_2, \ldots, e_n, l_n$ is orthonormal in it; compare a scalar product in the
spaces $C_n$ and $E_{2n}$. Besides $x$, take another arbitrary vector
$y = \sum y^k e_k \in C_n$, $y^k = \xi^k + i\eta^k$,
and again consider it as an element of the space $E_{2n}$:

$$y = \{\xi^1, \eta^1, \xi^2, \eta^2, \ldots, \xi^n, \eta^n\} \in E_{2n}$$




Denote a scalar product in $E_{2n}$ by $(x, y)_E$. We then have

$$(x, y)_E = u^1 \xi^1 + v^1 \eta^1 + u^2 \xi^2 + v^2 \eta^2 + \ldots + u^n \xi^n + v^n \eta^n \tag{26}$$

On the other hand, express as follows (via (25)) the scalar product $(x, y)_C$ of
the same vectors $x$, $y$ (this time, however, they will be regarded as elements
of the unitary space $C_n$):

$$(x, y)_C = \sum x^k \overline{y^k} = \sum (u^k + i v^k)(\xi^k - i \eta^k) \tag{27}$$

Comparing (26) and (27), we see that $(x, y)_E = \operatorname{Re}\,(x, y)_C$, whence it is clear
that vectors which are orthogonal in the unitary space $C_n$ are also orthogonal
in the Euclidean space $E_{2n}$. The converse is not true: vectors orthogonal in $E_{2n}$
may not be orthogonal in $C_n$. Such for example is the case if $\operatorname{Re}\,(x, y)_C = 0$
but $\operatorname{Im}\,(x, y)_C \ne 0$.

It is important to observe that despite the difference between the scalar
products $(x, y)_C$ and $(x, y)_E$, the norm $\|x\|_C$ of the vector $x$ in the unitary space
$C_n$ coincides with its Euclidean norm $\|x\|_E$:

$$\|x\|_C^2 = \sum (u^k)^2 + \sum (v^k)^2 = \|x\|_E^2 \tag{28}$$


To summarize, then, the spaces $C_n$ and $E_{2n}$ are isometric, but the scalar
product in the unitary space $C_n$ carries more information concerning the
factors because of the imaginary part. One should therefore expect that the
geometric properties of a unitary space will prove to be more fragile than those
of the Euclidean space isometric to it, that is, that the group of linear
transformations preserving them is narrower. In Subsection 16 it will become evident
that this is precisely the case.

Incidentally, from (28) it follows that in a unitary space we have the
triangle inequality

$$\|x + y\|_C \le \|x\|_C + \|y\|_C \tag{29}$$

since (29) definitely holds in Euclidean space.
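The relations $(x, y)_E = \operatorname{Re}\,(x, y)_C$ and $\|x\|_C = \|x\|_E$ can be tried out on concrete vectors. The sketch below is plain Python; the vectors and the helper names `unitary_product`, `realify` and `euclid_product` are illustrative assumptions. The last pair of vectors shows two elements that are orthogonal in $E_2$ but not in $C_1$.

```python
def unitary_product(x, y):
    """Scalar product (25) in an orthonormal basis of C_n."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def realify(x):
    """Coordinates {u^1, v^1, ..., u^n, v^n} of x viewed as a vector of E_2n."""
    out = []
    for c in x:
        out += [c.real, c.imag]
    return out

def euclid_product(p, q):
    """Scalar product (26) in the orthonormal basis e_1, l_1, ..., e_n, l_n."""
    return sum(a * b for a, b in zip(p, q))

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]

# Real part of the unitary product equals the Euclidean product.
print(unitary_product(x, y).real, euclid_product(realify(x), realify(y)))

# Norms coincide, formula (28).
print(unitary_product(x, x).real, euclid_product(realify(x), realify(x)))

# Orthogonal in E_2 but not in C_1: x = 1, y = i.
p, q = [1 + 0j], [1j]
print(euclid_product(realify(p), realify(q)), unitary_product(p, q))
```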

Further observe that the even-dimensional Euclidean space $E_{2n}$ may be
converted in the following manner into a unitary space $C_n$ of half the dimension
of $E_{2n}$ and isometric to it. In $E_{2n}$ take an orthonormal basis
$e_1, \ldots, e_n, e_{n+1}, \ldots, e_{2n}$ and, taking half the vectors of this basis, say the vectors
$e_1, \ldots, e_n$, let us define for each of them the operation of multiplication by the
imaginary unit, setting

$$i e_1 = e_{n+1},\ \ldots,\ i e_n = e_{2n}$$

Then it is easy to see that for any vector $x$ of $E_{2n}$ its product by any complex
number will be defined and in such a way that $E_{2n}$ becomes a complex
linear space of dimension $n$. If, as before, we assume the vectors $e_1, \ldots, e_n$
to be orthonormal, then (24) will define the norm of any vector, which means
that the scalar product (25) will also be defined. We will then have the unitary
space $C_n$ considered above. (We obtain complete agreement with the notation
introduced at the beginning of this subsection by setting $l_k = i e_k = e_{n+k}$,
$k = 1, \ldots, n$.)

Thus, up to an isomorphism, the unitary space $C_n$ is precisely the real
Euclidean space $E_{2n}$ equipped with certain supplementary properties.

13.  Let  us  now  take  up  the  study  of  the  most  important  classes  of  linear 
transformations  in  unitary  spaces. 

Let a linear transformation $A$ be given in a unitary space $L$.

DEFINITION. A linear transformation $\overset{*}{A}$ is said to be the conjugate of $A$
if $(Ax, z) = (x, \overset{*}{A}z)$ for any vectors $x$, $z$ in $L$.

The existence and uniqueness of the conjugate transformation $\overset{*}{A}$ may be
proved as in the case of a Euclidean space (see pp. 308-309) by first establishing
the existence and uniqueness of the reciprocal basis (as is done on pp. 289-290).
We may also proceed somewhat differently by confining ourselves to a
consideration of the transformations in an orthonormal basis, in which case the
reciprocal basis coincides with the given basis and the reasoning is somewhat
simplified.

Let $A$ be the matrix of the given transformation $y = Ax$ in an orthonormal
basis. Then, by (25), the scalar product $(y, z)$ may be written in matrix
notation thus:

$$(y, z) = \bar{z}^* A x \tag{30}$$

where $x$ and $z$ are column matrices ($n \times 1$ matrices). Relative to the same
basis let us consider the transformation $B$ with the matrix $\bar{A}^*$ obtained from
matrix $A$ via a transposition and a replacement of its elements by complex
conjugate numbers. We will show that $B = \overset{*}{A}$. As in (30), we compute the scalar
product:

$$(x, Bz) = \overline{(Bz, x)} = \overline{[\bar{x}^* \bar{A}^* z]} = \overline{([\bar{x}^* \bar{A}^* z])^*} = \overline{[z^* \bar{A}\, \bar{x}]} = \bar{z}^* A x = (y, z)$$

Thus, in an orthonormal basis the matrix of the conjugate transformation is
given by

$$\overset{*}{A} = \bar{A}^* \tag{31}$$

From (31) it follows that the transformation $A$ is conjugate to the
transformation $\overset{*}{A}$.
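Formula (31) lends itself to a quick numerical check. The following sketch, in plain Python with an arbitrarily chosen matrix and vectors (the names `apply`, `adjoint` and `sp` are illustrative), builds the conjugate transpose and tests the defining identity $(Ax, z) = (x, \overset{*}{A}z)$ with the scalar product (25).

```python
def apply(M, x):
    """Coordinates of the image of x under the transformation with matrix M."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    """Transpose M and replace its elements by complex conjugates, as in (31)."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def sp(x, y):
    """Scalar product (25) in an orthonormal basis."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

A = [[1 + 1j, 2], [-1j, 3 - 2j]]   # arbitrary complex matrix
x = [2 - 1j, 1j]
z = [1 + 1j, -3]

lhs = sp(apply(A, x), z)            # (Ax, z)
rhs = sp(x, apply(adjoint(A), z))   # (x, A* z)
print(abs(lhs - rhs) < 1e-12)
```

Applying `adjoint` twice returns the original matrix, which mirrors the closing remark that $A$ is conjugate to $\overset{*}{A}$.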

14. DEFINITION. A linear transformation $A$ in a unitary space is said to
be normal if

$$A\overset{*}{A} = \overset{*}{A}A \tag{32}$$

REMARK. In Euclidean space the notion of a normal transformation is also
introduced by means of formula (32).

THEOREM 2. A linear transformation $A$ in n-dimensional unitary space is
normal if and only if there exists an orthonormal basis made up of the
eigenvectors of $A$.

The proof of Theorem 2 is given at the end of this subsection. Let us first
establish two lemmas.

LEMMA 1. The commutative linear transformations $A$ and $B$ ($AB = BA$)
in an n-dimensional complex linear space $L$ always have a common
eigenvector.

PROOF OF LEMMA 1. Since $L$ is a complex space, the transformation $A$
has in it an eigenvector $x$ ($Ax = \lambda x$, $x \ne 0$). Because of the commutativity of $A$
and $B$, every nonzero vector of type

$$x,\ Bx,\ B^2 x,\ \ldots,\ B^k x,\ \ldots \tag{33}$$

is an eigenvector of $A$. Indeed,

$$A B^k x = B^k A x = B^k \lambda x = \lambda B^k x \tag{34}$$

In the sequence (33) let the first $p$ vectors be linearly independent, and let
the $(p+1)$th vector $B^p x$ be linearly expressible in terms of them. Then the
linear hull $\tilde{L} = L(x, Bx, \ldots, B^{p-1} x)$ is an invariant subspace of transformation
$B$. The transformation $B$ has an eigenvector $y$ in $\tilde{L}$. By (34), every nonzero
vector of $\tilde{L}$, being a linear combination of vectors of type (33), is an
eigenvector of $A$ with the same eigenvalue $\lambda$; in particular, $y$ is also an
eigenvector of $A$.

LEMMA 2. If in a unitary space $L$ a subspace $L'$ is invariant under a linear
transformation $A$, then its orthogonal complement $L''$ is invariant under the
conjugate transformation $\overset{*}{A}$.

PROOF OF LEMMA 2. Let $x \in L'$, $y \in L''$. Then $Ax \in L'$ and $(Ax, y) = 0$.
But $(Ax, y) = (x, \overset{*}{A}y)$ and so the vector $\overset{*}{A}y$ is orthogonal to the vector
$x$. Since $x$ is arbitrary in $L'$, it follows that $\overset{*}{A}y \in L''$.

PROOF OF THEOREM 2. Let $A$ and $\overset{*}{A}$ be commutative. Then by Lemma 1
they have a common eigenvector $x_1$. Its linear hull $L_1 = L(x_1)$ is invariant
under $A$ and $\overset{*}{A}$. Hence, by Lemma 2, the orthogonal complement $L_1''$ of subspace
$L_1$ is also invariant under $A$ and $\overset{*}{A}$. By Lemma 1 there will be in $L_1''$ a
common eigenvector $x_2$ of transformations $A$ and $\overset{*}{A}$, which is obviously orthogonal
to $x_1$. We then consider the linear hull $L_2 = L(x_1, x_2)$ and its orthogonal
complement $L_2''$, in which we find a common eigenvector $x_3$, and so forth.
Continuing this process we obtain a set of $n$ pairwise orthogonal common
eigenvectors $x_1, \ldots, x_n$ of $A$ and $\overset{*}{A}$. Normalizing these vectors, we obtain the desired
basis.

REMARK. The fact that a set of $n$ pairwise orthogonal nonzero vectors
constitutes a basis in n-dimensional unitary space is demonstrated by the
arguments given on page 258 for the real case.

Now let it be given that the transformation $A$ has an orthonormal basis
consisting of eigenvectors. Then by Subsection 5, Section 8, Chapter VII, the matrix
of $A$ in this basis is diagonal. By (31) the matrix of $\overset{*}{A}$ is also diagonal. But
diagonal matrices are always commutative, and so also are the transformations
$A$ and $\overset{*}{A}$. The proof is complete.
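A small instance may help fix the statement of Theorem 2. The sketch below is plain Python; the matrices and helper names are illustrative assumptions, not taken from the text. It takes the rotation of the real plane through a right angle, regarded as a transformation of the two-dimensional unitary space, checks that it is normal, and exhibits an orthonormal basis of eigenvectors with eigenvalues $i$ and $-i$. A shear matrix is included as a case in which condition (32) fails.

```python
import math

def apply(M, x):
    """Image of the coordinate list x under the matrix M."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    """Matrix of the conjugate transformation in an orthonormal basis, by (31)."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sp(x, y):
    """Scalar product (25)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def is_normal(M, tol=1e-12):
    """Check condition (32): M M* = M* M."""
    left, right = matmul(M, adjoint(M)), matmul(adjoint(M), M)
    n = len(M)
    return all(abs(left[i][j] - right[i][j]) < tol
               for i in range(n) for j in range(n))

A = [[0, -1], [1, 0]]      # rotation through 90 degrees
s = 1 / math.sqrt(2)
v1 = [s, -1j * s]          # eigenvector for the eigenvalue i
v2 = [s, 1j * s]           # eigenvector for the eigenvalue -i
N = [[1, 1], [0, 1]]       # shear: not normal

print(is_normal(A), is_normal(N))
print(abs(sp(v1, v2)), sp(v1, v1).real, sp(v2, v2).real)
```

Over the real numbers the same rotation has no eigenvector at all, which is why the orthonormal eigenbasis appears only after passing to the complex space.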

15. DEFINITION. A linear transformation $A$ in unitary space is said to be
self-adjoint or Hermitian if $\overset{*}{A} = A$.

From  the  definition  it  follows  directly  that  self-adjoint  transformations  are 
a  special  case  of  normal  transformations. 

THEOREM  3.  A  normal  transformation  A  in  n-dimensional  unitary  space  is 
self-adjoint  if  and  only  if  all  its  eigenvalues  are  real. 

PROOF. Theorem 3 is obvious: it suffices to write $A$ in an orthonormal
basis made up of eigenvectors and use (31).

REMARK 1. This theorem shows that a self-adjoint transformation operates
in n-dimensional unitary space in the same way as in Euclidean space: it
constitutes a stretching with real coefficients along the $n$ mutually orthogonal
directions.

REMARK 2. From (31) it is evident that $A$ is self-adjoint if and only if
its matrix in an arbitrary orthonormal basis is Hermitian ($A^* = \bar{A}$).

16. DEFINITION. A nonsingular linear transformation $A$ in unitary space is
said to be unitary if $\overset{*}{A} = A^{-1}$.

Unitary transformations are also a special case of normal transformations
since $A$ and $A^{-1}$ are clearly commutative ($AA^{-1} = A^{-1}A = E$). Arguing
precisely as on pp. 325-329, we can establish that in n-dimensional unitary space
the unitary transformations are exactly those linear transformations that preserve
the vector norm and scalar product, which is to say, the isometric transformations.

From  this  it  follows  immediately  that  unitary  transformations  constitute  a 
group,  called  the  unitary  group  (of  n-dimensional  space). 




An isometric transformation in Euclidean space may not have a single
eigenvector. The situation in unitary space is different: by Theorem 2, every unitary
transformation has an orthonormal basis made up of eigenvectors.

THEOREM 4. A linear transformation $A$ in n-dimensional unitary space is
unitary if and only if there is an orthonormal basis of its eigenvectors and all
its eigenvalues are equal to unity in absolute value.

PROOF. The existence of an orthonormal basis $e_1, \ldots, e_n$ of eigenvectors
of the transformation $A$ follows from Theorem 2. If all the eigenvalues $\lambda_j$ are
equal to unity in absolute value ($Ae_j = \lambda_j e_j$, $|\lambda_j| = 1$), then for an arbitrary
vector $x = \sum x^j e_j$ we have $Ax = \sum x^j A e_j = \sum x^j \lambda_j e_j$,
$\|Ax\|^2 = \sum |x^j \lambda_j|^2 = \sum |x^j|^2 = \|x\|^2$. Thus, $A$ is isometric and therefore unitary. If it is given
that $A$ is unitary, then $\overset{*}{A}A = E$. And so for an arbitrary eigenvector $y$ we have

$$(y, y) = (\overset{*}{A}Ay, y) = (Ay, Ay) = (\lambda y, \lambda y) = \lambda\bar{\lambda}\,(y, y)$$

Hence $\lambda\bar{\lambda} = |\lambda|^2 = 1$ and Theorem 4 is proved.

17. DEFINITION. A nonsingular (complex) $n \times n$ matrix $U$ is said to be
unitary if it satisfies the condition

$$U^{-1} = \bar{U}^* \tag{35}$$

Condition (35) may be rewritten thus: $U\bar{U}^* = E$ or $\bar{U}^* U = E$, whence it is
seen that the unitarity of matrix $U$ signifies the orthonormality of the set of
its rows and the set of its columns in the sense of the scalar product (25).

From (35) it follows that unitary matrices constitute a group. Indeed:

(1) if the matrix $U$ is unitary, then $U^{-1}$ is also unitary:

$$(U^{-1})^{-1} = U = \overline{(\bar{U}^*)}^{\,*} = \overline{(U^{-1})}^{\,*}$$

(2) if the matrices $U_1$ and $U_2$ are unitary, then $U_1 U_2$ is unitary:

$$(U_1 U_2)^{-1} = U_2^{-1} U_1^{-1} = \bar{U}_2^*\, \bar{U}_1^* = \overline{(U_1 U_2)}^{\,*}$$

Comparing (31) and (35), we see that all unitary matrices (these and no
others) specify unitary transformations in orthonormal bases. From this it
also follows that the group of unitary $n \times n$ matrices is
isomorphic to the unitary group of n-dimensional space.
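The two group properties can be observed on explicit matrices. The sketch below is plain Python; the matrices `U1`, `U2` and the helper names are illustrative assumptions. It tests condition (35) in the form $U\bar{U}^* = E$ for two unitary matrices, for their product, and for the conjugate transpose, which by (35) is the inverse.

```python
import cmath
import math

def adjoint(M):
    """Transpose M and replace its elements by complex conjugates."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unitary(M, tol=1e-12):
    """Check condition (35) in the form M * adjoint(M) = E."""
    n = len(M)
    prod = matmul(M, adjoint(M))
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U1 = [[s, s], [1j * s, -1j * s]]          # rows are orthonormal in the sense of (25)
U2 = [[cmath.exp(0.7j), 0], [0, 1j]]      # diagonal, entries of unit modulus

print(is_unitary(U1), is_unitary(U2))
print(is_unitary(matmul(U1, U2)), is_unitary(adjoint(U1)))
```

`U2` is also an instance of a unitary matrix that is not orthogonal, since its diagonal entries are nonreal.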

18. THEOREM 5. The orthogonal group of n-dimensional Euclidean space
is isomorphic to a subgroup of the unitary group of n-dimensional unitary space,
which group in turn is isomorphic to a subgroup of the rotation group of
2n-dimensional Euclidean space, which subgroup coincides with the whole
rotation group in the single special case $n = 1$.

PROOF. The first assertion of the theorem follows directly from the fact
that orthogonal matrices are a special case of unitary matrices: a matrix is
orthogonal if and only if it is unitary and real. We note in passing that for any
$n$ the orthogonal group does not exhaust the whole unitary group, since there
are unitary but not orthogonal $n \times n$ matrices. Such for instance is any diagonal
matrix in which $|a_{jj}| = 1$ for all $j$ but at least one of the numbers $a_{jj}$ is
nonreal.

For $n = 1$ the unitary matrix $U$ has a unique element $a_{11}$ with $|a_{11}| = 1$,
since in this case the matrix equation $U\bar{U}^* = E$ reduces to the numerical
equation $a_{11}\overline{a_{11}} = 1$. The space here is the plane of a complex variable with scalar
product $(x, y) = x\bar{y}$. For an orthonormal basis we can take the number $e = 1$
on the plane. The transformation specified in this basis by the matrix
$U = \|a_{11}\| = \|\cos\varphi + i \sin\varphi\|$ is nothing but the rotation of the complex plane
through the angle $\varphi$. Since the angle here can be arbitrary, we have established that
when $n = 1$, the unitary group actually coincides with the rotation group of
a two-dimensional Euclidean plane.

Furthermore, assuming $n \ge 2$ we take advantage of the construction given
in Subsection 11. This construction permits regarding one and the same space
as a unitary space $C_n$ and as a Euclidean space $E_{2n}$. At the same time it
permits viewing each unitary transformation in $C_n$ as an isometric linear
transformation in $E_{2n}$. Let $U$ be a unitary transformation in $C_n$. By Theorem 4 there
is an orthonormal basis $e_1, \ldots, e_n$ made up of eigenvectors, and the eigenvalues
are of the form $\cos\varphi_k + i \sin\varphi_k$, $k = 1, \ldots, n$. Set $l_k = i e_k$,
$L_k = L(e_k, l_k) \subset E_{2n}$. The transformation $U$, when regarded in $E_{2n}$, operates in
each plane $L_k$ as a rotation through the angle $\varphi_k$, and these planes are pairwise
orthogonal; consequently, $U$ is a rotation of $E_{2n}$, and the unitary group is
isomorphic to a subgroup of the rotation group of $E_{2n}$. For $n \ge 2$, however,
this rotation group has elements that cannot be regarded as unitary or even as
linear transformations in $C_n$. Such for example is the rotation $B$ that carries
the basis vectors $e_1, l_1, e_2, l_2, \ldots, e_n, l_n$ into the vectors
$e_1, e_2, -l_1, l_2, \ldots, e_n, l_n$, respectively, which is to say that it rotates the
plane $L(l_1, e_2)$ through the angle $\tfrac{\pi}{2}$ and leaves fixed the remaining vectors
of the basis, $e_1, l_2, \ldots, e_n, l_n$.

True enough, for in the complex space $C_n$ the linearity of the transformation
$A$ and the condition $Ae_1 = e_1$ should imply $Al_1 = A(ie_1) = iAe_1 = l_1$, but
by no means $Al_1 = e_2$. The proof of Theorem 5 is complete.
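The last step of the proof can be restated as a commutation test: a real-linear transformation of $E_{2n}$ is linear over the complex numbers exactly when it commutes with multiplication by $i$. The sketch below, in plain Python for $n = 2$ with the basis order $e_1, l_1, e_2, l_2$, is an illustration under that reformulation; the helper `realify`, the matrix `J` of multiplication by $i$, and the sample angles are assumptions not found in the text. It shows that a diagonal unitary transformation passes the test while the rotation $B$ described above does not.

```python
import cmath

def realify(M):
    """Real 2n x 2n matrix of a complex n x n matrix in the basis e_1, l_1, ..., e_n, l_n."""
    n = len(M)
    R = [[0.0] * (2 * n) for _ in range(2 * n)]
    for j in range(n):
        for k in range(n):
            a, b = complex(M[j][k]).real, complex(M[j][k]).imag
            R[2 * j][2 * k], R[2 * j][2 * k + 1] = a, -b
            R[2 * j + 1][2 * k], R[2 * j + 1][2 * k + 1] = b, a
    return R

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(p - q) < tol for r, s in zip(X, Y) for p, q in zip(r, s))

I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
J = realify([[1j, 0], [0, 1j]])                       # multiplication by i: e_k -> l_k
U = [[cmath.exp(0.3j), 0], [0, cmath.exp(-1.1j)]]     # unitary, sample angles
R = realify(U)

# Columns are the images of e_1, l_1, e_2, l_2: e_1, e_2, -l_1, l_2.
B = [[1, 0, 0, 0],
     [0, 0, -1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

print(close(matmul(transpose(R), R), I4), close(matmul(R, J), matmul(J, R)))
print(close(matmul(transpose(B), B), I4), close(matmul(B, J), matmul(J, B)))
```

Both `R` and `B` are isometries of $E_4$, but only `R` commutes with `J`; this is the computation $Al_1 = iAe_1$ of the proof written in matrix form.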

REMARK.  The  operation  of  rotation  B  when  n  =  2  is  shown  schematically 
in  Fig.  125. 

19. Let $L$ be an n-dimensional complex space. As in Subsections 6-9, Section
6, Chapter VIII, we establish the following.

(1) If a unitary metric is introduced in $L$, then the collection of all
orthonormal bases in the metric forms a class of bases relative to the unitary group.

(2) For every class of bases relative to the unitary group we can indicate
a unitary metric in which the bases of this class are orthonormal.

(3) If the unitary metric has been chosen, then the transformation of
coordinates when passing from one orthonormal basis to another (also
orthonormal) basis is specified by a unitary matrix.




20. As in Sections 4-5 of Chapter IX, it can be proved that a Hermitian
form can be reduced to canonical form (13) by a transformation of variables
with a unitary matrix, and that a pair of Hermitian forms can be brought to
canonical form via a nonsingular transformation of the variables if at least
one of them is positive definite.

REMARK. Let $A$ be the matrix of the Hermitian form $f$ in the initial basis
$e$. To bring the form to canonical form we need an auxiliary Hermitian
transformation with matrix $\bar{A} = A^*$ (in the same basis $e$); here the
bar denotes complex conjugation and the asterisk transposition. Let
$\tilde{e} = Pe$ be an orthonormal basis of eigenvectors of this
transformation. Relative to the basis $\tilde{e}$, the matrix $\Lambda$ of the
transformation at hand is diagonal and real, with
$\Lambda = Q\bar{A}Q^{-1} = Q\bar{A}P^*$ according to Section 2, Chapter VII.
When passing from basis $e$ to basis $\tilde{e}$, the matrix of the form $f$
transforms via (12): $A' = PA\bar{P}^*$. Since the matrix $P$ is unitary,
$P^{-1} = \bar{P}^*$, we have

$$Q = (P^{-1})^* = \bar{P}, \qquad P = \bar{Q} \tag{36}$$

Taking into account the foregoing, we see that

$$A' = PA\bar{P}^* = \bar{Q}A\bar{P}^* = \overline{(Q\bar{A}P^*)} = \bar{\Lambda} = \Lambda \tag{37}$$

so that the form $f$ is of canonical form in the basis $\tilde{e}$.

EXAMPLE. Let the Hermitian form $f$ in the orthonormal basis $e_1$, $e_2$ be
given by (14), that is,

$$f = (1 + 3i)\,x^1\bar{x}^2 + (1 - 3i)\,x^2\bar{x}^1$$

It is required to bring it to canonical form while remaining in the class of
orthonormal bases. For the auxiliary Hermitian transformation with the matrix

$$A^* = \begin{pmatrix} 0 & 1 - 3i \\ 1 + 3i & 0 \end{pmatrix}$$

we find the characteristic polynomial
$\det(A^* - \lambda E) = \det(A - \lambda E) = \lambda^2 - 10$, its roots
$\lambda_{1,2} = \pm\sqrt{10}$, and the eigenvectors $\tilde{e}_1$ and
$\tilde{e}_2$ with the eigenvalues $\lambda_1$ and $\lambda_2$, respectively:


$$\tilde{e}_1 = \frac{1}{\sqrt{2}}\,e_1 + \frac{1 + 3i}{2\sqrt{5}}\,e_2, \qquad \tilde{e}_2 = -\frac{1 - 3i}{2\sqrt{5}}\,e_1 + \frac{1}{\sqrt{2}}\,e_2 \tag{38}$$


The vectors (38) have already been normalized:
$\|\tilde{e}_1\| = \|\tilde{e}_2\| = 1$; they constitute the desired basis in
which

$$f = \lambda_1 y^1\bar{y}^1 + \lambda_2 y^2\bar{y}^2 = \sqrt{10}\,y^1\bar{y}^1 - \sqrt{10}\,y^2\bar{y}^2$$


Taking into account (36), it is easy to write down the transformation of
coordinates:

$$y^1 = \frac{1}{\sqrt{2}}\,x^1 + \frac{1 - 3i}{2\sqrt{5}}\,x^2, \qquad y^2 = -\frac{1 + 3i}{2\sqrt{5}}\,x^1 + \frac{1}{\sqrt{2}}\,x^2$$
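
The example can be verified numerically. The sketch below assumes the form is $f = (1+3i)x^1\bar{x}^2 + (1-3i)x^2\bar{x}^1$ and that the rows of the matrix `P` hold the components of the new basis vectors in the old basis, so that the new coordinates are obtained with the conjugate matrix as in (36). The function names and the sample points are illustrative choices, not part of the text.

```python
import math

s2, s5, r10 = math.sqrt(2), math.sqrt(5), math.sqrt(10)

def f(x1, x2):
    """The Hermitian form of the example in the original coordinates."""
    return (1 + 3j) * x1 * x2.conjugate() + (1 - 3j) * x2 * x1.conjugate()

# Rows of P: components of the new (eigenvector) basis in the old basis.
P = [[complex(1 / s2), (1 + 3j) / (2 * s5)],
     [-(1 - 3j) / (2 * s5), complex(1 / s2)]]

def is_unitary(M, tol=1e-12):
    """Check that the rows of M are orthonormal in the unitary metric."""
    n = len(M)
    for i in range(n):
        for j in range(n):
            dot = sum(M[i][k] * M[j][k].conjugate() for k in range(n))
            if abs(dot - (1 if i == j else 0)) > tol:
                return False
    return True

def new_coordinates(x1, x2):
    """y = conj(P) x, the transformation of coordinates of the example."""
    y1 = P[0][0].conjugate() * x1 + P[0][1].conjugate() * x2
    y2 = P[1][0].conjugate() * x1 + P[1][1].conjugate() * x2
    return y1, y2

def canonical(y1, y2):
    """sqrt(10) * |y1|^2 - sqrt(10) * |y2|^2."""
    return r10 * abs(y1) ** 2 - r10 * abs(y2) ** 2

samples = [(1 + 2j, -0.5 + 1j), (3 + 0j, 1j), (0.25 - 4j, 2 + 2j)]
max_error = max(abs(f(x1, x2) - canonical(*new_coordinates(x1, x2)))
                for x1, x2 in samples)
print(is_unitary(P), max_error < 1e-9)
```

Agreement at a few sample points is only a sanity check of the arithmetic, not a proof; the proof is the computation (37).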


21. We conclude this section with the statement of an important theorem
that may be proved by analogy with Section 11 of Chapter IX.

THEOREM 6. For any nonsingular linear transformation A in n-dimensional
unitary space there is a self-adjoint transformation H and a unitary
transformation U such that A = UH.
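
To make Theorem 6 concrete, here is a minimal numerical sketch for a 2x2 complex matrix, taken as the matrix of A in an orthonormal basis. It assumes the standard construction $H = \sqrt{A^{\mathsf H}A}$, $U = AH^{-1}$, where $A^{\mathsf H}$ is the conjugate transpose; this need not be the route taken in the proof referred to above. It also uses a closed-form square root that is valid only for 2x2 positive definite matrices. The function names and the sample matrix are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

def polar_2x2(A):
    """Return (U, H) with A = U H for a nonsingular 2x2 complex matrix A."""
    M = matmul(adjoint(A), A)          # positive definite Hermitian matrix
    det_M = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    tr_M = (M[0][0] + M[1][1]).real
    s = math.sqrt(det_M)
    t = math.sqrt(tr_M + 2 * s)
    # 2x2 square root: sqrt(M) = (M + sqrt(det M) * E) / sqrt(tr M + 2 sqrt(det M))
    H = [[(M[i][j] + (s if i == j else 0)) / t for j in range(2)]
         for i in range(2)]
    det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    H_inv = [[H[1][1] / det_H, -H[0][1] / det_H],
             [-H[1][0] / det_H, H[0][0] / det_H]]
    return matmul(A, H_inv), H

A = [[1 + 2j, 0.5j], [-1 + 0j, 2 - 1j]]   # sample nonsingular matrix
U, H = polar_2x2(A)
I2 = [[1, 0], [0, 1]]
err_product = max_abs_diff(matmul(U, H), A)            # A = U H
err_unitary = max_abs_diff(matmul(adjoint(U), U), I2)  # U is unitary
err_hermitian = max_abs_diff(H, adjoint(H))            # H is self-adjoint
print(err_product < 1e-12, err_unitary < 1e-12, err_hermitian < 1e-12)
```

The square-root formula follows from the Hamilton-Cayley relation for 2x2 matrices; for larger n one would diagonalize $A^{\mathsf H}A$ in an orthonormal eigenbasis instead.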


Bibliography 


1. Aleksandrov, P. S. Lektsii po analiticheskoi geometrii... (Lectures in
Analytic Geometry, supplemented with necessary facts from algebra together
with a collection of problems with solutions compiled by A. S. Parkhomenko),
Moscow, Nauka (1968).

2. Bishop, R. L. and Crittenden, R. J. Geometry of Manifolds, New York,
Academic Press (1964).

3. Bourbaki, N. Éléments de mathématique. Livre II. Algèbre: Ch. 1.
Structures algébriques, Paris, Hermann (1958); Ch. 2. Algèbre linéaire,
Paris, Hermann (1955); Ch. 3. Algèbre multilinéaire, Paris, Hermann (1948).

4. Bourbaki, N. Éléments de mathématique. Livre II. Algèbre: Ch. 6. Groupes
et corps ordonnés; Ch. 7. Modules sur les anneaux principaux, Paris, Hermann
(1952); Ch. 8. Modules et anneaux semi-simples, Paris, Hermann (1958); Ch. 9.
Formes sesquilinéaires et formes quadratiques, Paris, Hermann (1959).

5. Efimov, N. V. Vysshaya geometriya (Higher Geometry), Moscow, Fizmatgiz
(1961).

6. Faddeev, D. K. and Faddeeva, V. N. Computational Methods of Linear
Algebra, San Francisco, Freeman (1963).

7. Faddeev, D. K. and Sominsky, I. S. Problems in Higher Algebra, Moscow,
Mir Publishers (1972).

8. Gantmakher, F. R. The Theory of Matrices, Vols. 1 and 2, New York,
Chelsea (1959).

9. Gel’fand, I. M. Lectures on Linear Algebra, New York, Interscience (1961).

10. Kurosh, A. G. Higher Algebra, Moscow, Mir Publishers (1972).

11. Lang, S. Algebra, Reading, Mass., Addison-Wesley (1965).

12. Mal'cev, A. I. Foundations of Linear Algebra, San Francisco, Freeman
(1963).

13. Manning, H. P. Geometry of Four Dimensions, New York, Dover (1956).

14. Proskuryakov, I. V. Sbornik zadach po lineinoi algebre (Problem Book in
Linear Algebra), Moscow, Nauka (1967).

15. Rashevsky, P. K. Rimanova geometriya i tenzorny analiz (Riemannian
Geometry and Tensor Analysis), Moscow, Gostekhizdat (1953).

16. Reichardt, H. Vorlesungen über Vektor- und Tensorrechnung, Berlin,
Deutscher Verlag der Wissenschaften (1968).

17. Rozenfeld, B. A. Mnogomernye prostranstva (Multidimensional Spaces),
Moscow, Nauka (1966).

18. Schreier, O. and Sperner, E. Introduction to Modern Algebra and Matrix
Theory, New York, Chelsea (1961).

19. Shilov, G. E. Linear Algebra, Englewood Cliffs, N. J., Prentice-Hall
(1961).

20. Shilov, G. E. Matematicheski analiz. Konechnomernye lineinye prostranstva
(Mathematical Analysis. Finite-dimensional Linear Spaces), Moscow, Nauka
(1969).

21. Sommerville, D. M. Y. An Introduction to the Geometry of N Dimensions,
London, Methuen (1929).

22. Spivak, M. Calculus on Manifolds, New York, Benjamin (1965).

23. Sternberg, S. Lectures on Differential Geometry, Englewood Cliffs, N. J.,
Prentice-Hall (1964).

24. Tyshkevich, R. I. and Fedenko, A. S. Lineinaya algebra i analiticheskaya
geometriya (Linear Algebra and Analytic Geometry), Minsk, Vysheishaya shkola
(1968).

25. Yudin, D. B. and Gol'shtein, E. G. Lineinoye programmirovanie (Linear
Programming), Moscow, Nauka (1969).


Index 


Abelian  group  181 
addition  (group  theory)  181 
adjoint  of  a  transformation  308 
admissible replacement 15, 150-151
affine classification of quadric hypersurfaces 414
affine  coordinates  72 
affine  equivalence  of  figures  414 
affine  group  414 
affine  space  70,  71 
bundle  of  planes  in  435 
homogeneous coordinates in 422
affine system of coordinates 72
affine transformations 410, 411, 434
affinely equivalent figures 414
Aleksandrov,  P.  S.  484 
algebra  353 
Grassmann  352 
linear  12 
tensor  142 

tensor  (in  quadratic-metric  spaces) 
287 

alternate  product  351 
alternation  170,  172,  346 
of  components  of  a  tensor  350 
of  a  form  376 
of  indices  350 
of  a  tensor  348 
angular  velocity  337 
anharmonic  ratio  432 
antisymmetrization  170,  172 
associative  property  16 
asymptotic  cone  418 
asymptotic  direction  417 
asymptotic  lines  418 
augmented  matrix  77 
axes  (see  axis) 
semiconjugate 405
semitransverse  405 
axial  invariants  191,  195 


axial  pseudoinvariants  196,  198 
axial  pseudotensors  199 
axial  tensors  198 

axioms  of  a  linear  space  12,  15,  17 
corollaries  to  23 
axis  (see  axes) 

instantaneous (of rotation) 337


barycentric  coordinates  105 
bases (see basis) 34
class  of  183 
left-handed  185,  186 
negatively  oriented  185,  186 
positively  oriented  185,  186 
right-handed  185,  186 
basis  (see  bases)  34,  35 
canonical  233,  243 
orthonormal  258,  259 
reciprocal  142,  289 
in  a  tensor  product  153 
basis  columns  28 
basis  minor  27,  28 
lemma  on  27,  28 
basis  rows  28 
bilinear  forms  112,  113,  471 
coefficients  of  1 13 
matrix  of  116 
nonsingular  135 
rank  of  118 
singular  135 
skew-symmetric  114 
symmetric  114 

bilinear  Hermitian  form  472 
binormal  vector  338 
Bishop, R. L. 484
bivector(s)  357 
area  of  359 
direction  of  358 
projection  of  360 


rank  of  362 
simple  357 
subspace  of  358 
unit  359 
bounded  set  102 
Bourbaki,  N.  484 

bundle  of  planes  in  affine  space  435 
Bunyakovsky (see Cauchy-Bunyakovsky inequality)

canonical  basis  233,  243 
Capelli (see Kronecker-Capelli theorem)

Cartan’s  lemma  365-366 
for  outer  forms  386 

Cauchy-Bunyakovsky  inequality  132, 
133 

Cayley (see Hamilton-Cayley formula)
central  projection  443,  444,  446 
characteristic  determinant  225 
characteristic  equation  225 
characteristic  matrix  225 
characteristic  polynomial  225,  227 
circle 

imaginary-unit  278 
unit  278 

class  of  bases  183 
classification 

of  linear  quantities,  theorem  of  467 
projective (of quadric hypersurfaces) 453
coefficients 
of  a  cubic  form  172 
principal  (of  a  skew  form)  173 
of  similarity  218 
of  volume  change  345 
commutative  group  181 
commutative  property  for  addition  16 
complete  quadrangle  451 
component(s)
contravariant  293 
covariant  293 

of  multiple-order  tensors  166 
of  a  tensor  153,  155,  197 
of  a  vector  35 

component  representation  209 
compression  220,  221 
cone  137 
asymptotic  418 
imaginary  139 
real  409 
zero  137,  139 
conjugate  directions  418 
conjugate  space  111,  147,  148 
conjugate  transformation  479 
consistent  system  77 


constant  term  78,  391 
contraction  145,  147,  152 
complete  166 
inner  158,  166 
left  152 
right  152 

contraction  factor  218 
contravariance,  order  of  200 
contravariant  components  293 
contravariant  metric  tensor  289 
contravariant  tensors  155 
contravariant  vectors  142,  146 
convex  hull  102 
convex  polyhedrons  98,  100 
convex  set  99,  100 

coordinate(s)  (see  components)  197 
affine  72 
barycentric  105 
homogeneous  422,  426 
running  391 
of  a  vector  35 

coordinate  representation  209 
coordinate  system,  projective  427 
cosine,  direction  298 
covariance,  order  of  200 
covariant  components  293 
covariant  tensors  155 
covariant  vectors  142,  146 
criterion,  Sylvester's  131 
Crittenden,  R.  J.  484 
cross  ratio  432 
cubic  form  172 
coefficients  of  172 

curvature  of  a  space  curve  338,  340 
curves  454 
oval  454 
zero  454 

cylinder  (hypersurface)  409 
parabolic  410 


Darboux  vector  339 
degenerate  surfaces  409 
determinant 
characteristic  225 
Gram’s  132 
diagonal  points  451 
diametral  hyperplane  420 
differential of a nonlinear transformation 343

dimension  of  a  linear  space  34 
direct  sum  48,  49 
direction(s)
asymptotic  417 
conjugate  418 
direction  bivector  358 
direction  cosines  298 




direction multivector 367
direction  subspace  73 
direction  vector  358 

discriminant  tensor  201,  204,  304,  307 
distributive  property  17 
divisors,  elementary  247 
double  point  intersection  460 
duality  principle  in  theory  of  polars 
465 

dummy  indices  55 

edge  of  a  parallelepiped  101-102 
edge  of  a  simplex  105 
Efimov,  N.  V.  484 
eigenvalue  228 
eigenvector  224,  227 
elastic  medium,  continuous  343 
elasticity,  theory  of  222,  301,  343 
element(s)
ideal  425 
inverse  16 
of  a  matrix  27 
zero  16 

elementary  divisors  247 
ellipsoid 

(n - 1)-dimensional 404
imaginary  405 
semiaxes  of  405 
elliptic  rotation  276 
equation(s) 
canonical  392 

of the centre (of a quadric hypersurface) 398
characteristic  225 

general  (of  quadric  hypersurface) 
391,  392,  399 

of  hyperplane  (in  quadratic-metric 
space)  295 
normal  302 
parametric  47,  76 
systems  of  first-degree  77 
equivalence,  projective  (of  figures) 
446 

Euclidean  linear  space  297 
Euclidean  rotations,  group  of  270,  273 
Euclidean  space  259,  297 
expansion  of  a  vector  35 
expansion  factor  218 

face(s) 

opposite  (of  a  simplex)  105 
of  a  parallelepiped  101 
of  a  simplex  105 
factor, contraction 218
expansion 218
Faddeev, D. K. 484
Faddeeva,  V.  N.  484 


Fedenko,  A.  S.  485 
figures 

affinely  equivalent  414 
projectively  equivalent  447 
form(s) 

bilinear  112,  113,  471 
bilinear Hermitian 472
cubic 172
of degree k 109
k-form 378

Hermitian  471,  472,  473 
Jordan  normal  242,  250 
linear  108,  109 
metric  256,  257,  475 
multilinear  168,  170 
normal  (of  a  quadratic  form)  124, 
125 

outer  376,  378 
outer  quadratic  384 
quadratic  109,  118 
quadratic  Hermitian  472 
skew  171,  377 
skew-symmetric  170,  171 
symmetric  170 
tensor  of  170 
trilinear  170 
formula(s)

Frenet 340
Hamilton-Cayley 252
free  indices  55 
Frenet  formulas  340 
function(s)
linear  108,  207 

linear  (of  first  and  second  kind)  471 
orthogonal  278 
functional  analysis  12 
fundamental  set  of  solutions  83 

Gantmakher,  F.  R.  484 
Gel’fand,  I.  M.  484 
general  position  75 
geometry 
definition  of  196 
Minkowski  270,  278 
multidimensional  analytic  13 
projective  427,  429,  430,  434 
Gol’shtein,  E.  G.  485 
Gram’s  determinant  132 
Grassmann  algebra  352 
group  180 
Abelian  181 
affine  414 
commutative  181 
of  Euclidean  rotations  270,  273 
of  first-degree  terms  391 
of  higher-degree  terms  391 
of  hyperbolic  rotations  278 



isomorphic  189 

of  linear  transformations  189 
of  nonsingular  linear  transforma¬ 
tions  213 

orthogonal  297,  299 
k-orthogonal 266
projective  427 
rotation  273,  300 
transformation  186,  188 
unimodular  196 
unitary  480 


half-spaces  (closed,  open)  99 
Hamilton-Cayley  formula  252 
harmonic  set  448 
harmonically  divides  448 
harmonically  situated  463 
height  of  a  nilpotent  transformation 
230 

Hermitian  forms  471-473 
law  of  inertia  of  474 
law  of  transformation  of  coefficients 
of  473 

matrix  of  473 
nonsingular  473 
rank  of  473 

Sylvester’s  criterion  for  474 
Hermitian  matrix  474 
Hermitian  metric  476 
Hermitian  transformation  480 
homogeneous  coordinates  422,  426 
homogeneous  systems  81-82 
homomorphism  186,  190 
examples  of  191 
kernel  of  190 
Hooke’s  law  345 
hull 

convex  102 
linear  44 

hyperbolic  rotations  278,  284 
hyperboloid  405,  406 
hyperplane  74 
diametral  420 

equation  of  (in  quadratic-metric 
space)  295 
ideal  424 

normal  equation  of  (in  Euclidean 
space)  302 

in  projective  space  428 
hypersurface(s) 
central  399 
imaginary  391 
quadric  391 
singular  point  of  461 
zero  391 


ideal  elements  425 
ideal  hyperplane  424 
ideal  points  424 

identity  transformation  62,  187,  207 
identity  matrix  62 
image  186 
inverse  186 

imaginary-unit  circle  278 
inclusion  (of  sets)  45 
index (see indices)
negative 126
positive 126, 259
indices (see index)
dummy  55 
free  55 

“juggling”  293 
umbral  55 

inequality,  Cauchy-Bunyakovsky  132, 
133 

inertia,  law  of  (of  quadratic  forms) 
125-126 

inner-product  spaces  256 
intersection 
double  point  of  460 
of  quadric  hypersurface  and  a  line 
459 

invariance 

in  an  orthogonal  group  274 
projective  (of  a  polar)  462 
invariant(s)  191,  195,  196 
axial  191,  195 
invariant  subspaces  216 
inverse  180,  181 
of  a  matrix  62 
of  a  transformation  186 
inverse  element  16 
inverse  image  186 
invertible  square  matrix  60 
isometric transformations 273, 325, 327, 413

canonical  form  of  330 
isometry  413 
isomorphic  groups  189 
isomorphic  tensor  products  178 
isomorphism  186,  188,  190 
linear  38 
metric  265 

isotropic  vectors  256,  278 

Jacobi’s method (see reducing a quadratic form to canonical form)
Jordan  normal  form  242,  250 
Jordan  submatrix  223 
“juggling”  indices  293 

kernel  of  a  homomorphism  190 
Kronecker-Capelli  theorem  77 




Kronecker  delta  69 
Kurosh,  A.  G.  484 

Lagrange’s  method  (see  reducing  a 
quadratic  form  to  canonical  form) 
Lang,  S.  484 
law 

Hooke’s  345 

of  inertia  of  Hermitian  forms  474 
of  inertia  of  quadratic  forms  125 
of  transformation  of  a  scalar  quan¬ 
tity  193 

left  zero  subspace  134 
Legendre  polynomials  264 
lemma 

basic  (on  two  systems  of  vectors) 
30 

on  basis  minor  27,  28 
Cartan’s  365 

Cartan's  (for  outer  forms)  386 
line,  projective  430 
line  segment,  midpoint  of  99 
linear  combinations  25,  26 
nontrivial  25 
trivial  25 

linear  dependence  25 
linear  forms  108,  109 
linear  function  (s)  108,  207 
of  first  kind  471 
of  second  kind  471 
linear  geometric  entities  194 
linear  hull  44 

linear  inequalities,  systems  of  98 
linear  isomorphism  38 
linear  map  207,  209 
linear  mapping  207 
linear  operator  207,  209 
linear operations 11
in components 36
linear programming 107
linear quantities, theorem on classification of 467

linear  scalar  quantities  194,  195 
linear  space(s)  12,  17,  20,  210 
complex  21 
dimension  of  34 
n-dimensional 34
finite-dimensional  12 
infinite-dimensional  12 
isomorphism  between  38 
real  21 

zero-dimensional  34 
linear  subspace  42 
linear  transformations 
of linear spaces 207ff
of a tensor 210f
in unitary spaces 478
of variables 56, 57, 60
linearly  isomorphic  spaces  38 

Mal’cev, A. I. 484
Manning,  H.  P.  484 
map,  linear  207,  210 
mapping,  linear  207 
projective  444,  446 
matrices  (see  matrix) 
equivalent  250 
k-orthogonal 266, 268
orthogonal  270,  297,  299 
matrix  (see  matrices) 
augmented  77 

basic  (of  a  system  of  equations)  77 

of a bilinear form 116

characteristic  225 

complex  20 

elements  of  27 

Hermitian  474 

of  a  Hermitian  form  473 

identity  62 

inverse  of  62 

invertible  square  60 

of  a  linear  transformation  56 

nonsingular  square  60 

of  a  quadratic  form  120 

rank  of  32 

real  20 

rectangular  20 
square  60 
transpose  of  59 
unit  62,  181 
unitary  481 
zero  21 

matrix  product  57 
membership  in  a  set  38 
metric(s) 

Hermitian  476 
Minkowski  287 
unitary  475 
metric  concepts  76 
metric  form  256,  257,  475 
metric  isomorphism  265 
metric  tensor  288,  293 
metrically  isomorphic  spaces  265 
midpoint  of  a  line  segment  99 
Minkowski  geometry  270,  278 
Minkowski  metrics  287 
Minkowski  space  259 
minor(s)  27 
basis  27,  28 
lemma  on  27,  28 
bordering  27,  28 
principal  127 
mixed  tensors  155 




Mobius  strip  450 
moving  trihedral  339 
multilinear  forms  168,  170 
multiple-order  tensors  162 
components  of  166 
multiplication 
(group  operation)  180 
outer  353 
scalar  254 
multivectors  351 
contravariant  352 
direction  367 
order  of  356,  366 
and outer forms 346ff
projection  369 
simple  contravariant  366 
unit  368 
zero  352 

negative  definite  quadratic  form  129 
nilpotent  transformation  229,  230 
canonical  basis  of  233 
height  of  230 

nondegenerate  quadric  hypersurface 
454 

nonhomogeneous  system  88 
nonsingular  bilinear  form  135 
nonsingular  linear  transformation  60 
nonsingular  square  matrix  60 
nonsingular  transformation  60,  214 
nonsingularity  215 
nontrivial  solution  84 
norm  of  a  vector  256 
normal equation of hyperplane in Euclidean space 302

normal  form  of  a  quadratic  form  124, 
125 

null  space  of  a  transformation  213 

one-to-one  transformation  186 
operator, linear 207, 209
order  of  covariance,  of  contravariance 
200 

orientation  of  a  basis  185,  186 
positive  204 

oriented  volume  of  a  parallelepiped 
201,  206 

orthogonal  complement  255 
orthogonal  functions  278 
orthogonal  group  297,  299 
orthogonal matrices 270, 297, 299
orthogonal  projection  259 
orthogonal  subspaces  255 
orthogonal  systems  of  functions  20 1 
orthogonal  transformation  300 
orthogonal  vectors  255 
orthogonalization  259,  261 


orthonormal  basis  258,  259 
osculating  plane  338 
outer  forms  376,  378 
and  covariant  multivectors  379 
of  degree  two  384 
outer  product  of  378 
theorems  on  386 

in  three-dimensional  Euclidean 
space  386 

outer  multiplication  353 
outer  product  351 
of  outer  forms  378 
outer  quadratic  form  381 
oval  curve  454 
oval  surface  455 

paraboloids  408 
parallelepiped  101,  102,  204 
volume  of  (in  Euclidean  space)  304 
parallelogram  204 
Parkhomenko,  A.  S.  484 
plane(s)  73 

bundle  of  (in  affine  space)  435 
k-dimensional 429
r-dimensional  73 
intersecting  91 
mutual  positions  of  91 
osculating  338 
parallel  93 
projective  429,  454 
skew  94,  429 
point(s) 
diagonal  451 

double  (of  intersection)  460 
fourth harmonic 448
ideal  424 

at  infinity  422,  424 
in  projective  space  425 
running  73 

singular  (of  a  hypersurface)  461 
units  442 
polar(s)  459,  461 

duality  principle  in  theory  of  465 
projective  invariance  of  462 
properties  of  465 
of  a  quadratic  form  118 
pole  (of  a  hyperplane)  461 
polyhedrons,  convex  98,  100 
polynomial(s)
characteristic  225,  227 
Legendre  263 
of  transformations  208 
position,  general  75 
positive  definite  quadratic  form  129 
positive  definiteness  129 
positive  index  of  a  space  259 
principal  minors  of  a  matrix  127 




principle,  duality  (in  theory  of  polars) 
465 

product(s) 
alternate  351 

of  geometric  vector  by  a  complex 
number  41 
(group  theory)  180 
outer  351 
scalar  254 
symbolic  150 

tensor  149ff,  162,  174,  177 
of  transformations  187,  188 
vector  304,  370,  371 
programming,  linear  107 
projection  444 
of  a  bivector  360 
central  443,  444,  446 
orthogonal  259 
of  a  space  223 

of  a  vector  on  a  normal  303 
projective classification of quadric hypersurfaces 453

projective  equivalence  of  figures  446 
projective  geometry  427,  429,  430,  434 
projective  group  427 
projective  invariance  of  a  polar  462 
projective  line  430 
projective  mapping  444,  446 
projective  plane  429,  454 
projective space 422ff
complex  426 
hyperplane  in  428 
model  of  435 
planes  in  426 
points  in  425 

quadric  hypersurface  in  430 
real  426,  442 
projective  structures  426 
projective  systems  of  coordinates  427 
projective  transformation  427 
Proskuryakov,  I.  V.  484 
pseudoinvariants  191,  196,  198 
axial  196,  198 
pseudotensors  198 
axial  199 


quadrangle,  complete  451 
quadratic  form  109,  118 
law  of  inertia  of  125 
matrix  of  120 
negative  definite  129 
normal  form  of  124,  125 
polar of 118
positive  definite  129 
rank  of  120 

reduction of to canonical form 121ff, 317, 319
signature  of  125 
zero  cone  of  137,  138 
zero  subspace  of  136 
quadratic-metric  spaces  254,  256 
quadric hypersurfaces 391ff
affine  classification  of  414 
centre  of  397,  398 

classification  of  (in  Euclidean 
space)  402 

general  equation  of  391,  392,  399 
intersection  of  a  straight  line  with 
415,  459 

nondegenerate  404,  454 
projective  classification  of  453 
in  projective  space  430 
quasi-Euclidean  space  259,  266 


rank 

of  a  bilinear  form  118 
of  a  bivector  362 
of  a  matrix  32 
theorem  on  32 

of  a  product  of  matrices  64 
of  a  quadratic  form  120 
of  a  system  of  equations  79 
of  a  system  of  vectors  31 
rank  subspace  362 
Rashevsky,  P.  K.  484 
ratio 

anharmonic  432 
cross  432 

reciprocal  bases  142,  289 
reducing a quadratic form to canonical form 121

Jacobi’s  method  of  121,  127 
Lagrange’s  method  of  121 
reflections  274 
Reichardt,  H.  484 
relativity,  theory  of  408 
replacements,  admissible  15,  151 
representation 
component  209 
coordinate  209 

rigid body (with one fixed point), motion of 335

right  zero  subspace  134 
Rozenfeld,  B.  A.  484 
rotation(s) 300
elliptic  276 
Euclidean  270 
of  Euclidean  plane  273 
hyperbolic  278,  284 
rotation  group  273,  300 
running coordinates 391
running  point  73 




scalar,  arbitrary  15 
scalar  multiplication  254 
scalar  products  254 
scalar  quantity  192 
law  of  transformation  of  192 
Schreier,  O.  484 

self-adjoint transformations 310, 311ff, 480

self-conjugate  spaces  292 
self-polar  set  of  points  465 
semiaxes  of  ellipsoid  405 
sequence  of  vectors  233 
set(s) 

bounded  102 
convex  99,  100 

fundamental  (of  solutions)  83 
harmonic  448 
inclusion  in  45 
membership  in  38 
self-polar  (of  points)  465 
sheaf  of  planes  (see  bundle  of  planes) 
435 

Shilov,  G.  E.  485 

signature  (of  a  quadratic  form)  125 
similarity,  coefficient  of  218 
similarity  transformation  218,  226 
simplex  105,  106 
centre  105 
r-dimensional  105 
edge  of  105 
face  of  105 
opposite  faces  of  105 
singular  bilinear  form  135 
singular  point  of  a  hypersurface  461 
singular  transformations  229 
skew-adjoint  transformations  310,  322 
skew  form  171,  377 
skew  planes  94,  429 
skew-symmetric  bilinear  form  114 
skew-symmetric  forms  170,  171 
skew  tensor  351 
skew  transformations  310,  322 
solution(s) 

fundamental  set  of  83 
nontrivial  82 
trivial  82 

Sominsky,  I.  S.  484 
Sommerville,  D.  M.  Y.  485 
space(s) 12
affine  70,  71,  442 
complex  coordinate  20 
complex  vector  17 
conjugate  111,  147,  148 
of  continuous  functions  21 
coordinate  19 

correspondence  between  complex  and 
real  40 



Euclidean  259,  297 
Euclidean  linear  297 
finite-dimensional  34 
of functions integrable on an interval 23
of geometric vectors 18
with a Hermitian metric 476
infinite-dimensional 34
inner-product 256
of integrable functions 22
linear  (see  linear  spaces)  210 
linearly  isomorphic  38 
of  matrices  20 
metric  form  of  475 
metrically  isomorphic  265 
Minkowski  259 
null  213 

projective 422ff, 425, 426
quadratic-metric  254,  256 
quasi-Euclidean  259,  266 
real  coordinate  19 
real  vector  17 
self-conjugate  292 
unitary  471,  476 
vector  13,  17 
zero  18 

zero-dimensional  linear  34 
Sperner,  E.  484 
Spivak,  M.  485 

sphere, (n - 1)-dimensional 405
Sternberg,  S.  485 
strain  tensor  343,  345 
stress  345 

stress  tensor  343,  345 
subgroup(s)  181 
examples  of  182 
submatrix, Jordan 223
subspace(s)
direction  73 
invariant  216 
linear  42 
orthogonal  255 
rank  362 
sum  of  47 
sum 

direct  47,  48 
inner  54 
outer  54 
of  subspaces  47 
symbolic  150 

summation,  notation  for  53 
summation  symbol,  properties  of  53 
surface 

degenerate  409 
oval  455 
toroidal  455,  457 


zero  455 

Sylvester’s  criterion  131 
for  Hermitian  forms  474 
symbolic  product  150 
symbolic  sum  150 
symmetric  bilinear  form  114 
symmetric  forms  170 
symmetrization  170,  171 
systems  of  first-degree  equations  77 
coefficients  of  77 
consistent  77 
homogeneous  81 
nonhomogeneous  88 
solution  of  77 

systems  of  functions,  orthogonal  264 
systems  of  linear  inequalities  98 
systems  of  vectors  26 


tensor(s)
axial  198 

of  bilinear  forms  159 
components  of  153,  155 
contravariant  155 
contravariant  metric  289 
corresponding  179 
covariant  155 

discriminant  201,  204,  304,  305 
metric  287,  294 
mixed  155,  198 
multiple-order  162 
of  a  form  169 
of  order  one  155,  163 
of  order  two,  test  for  distinguishing 
162 

of  order  zero  159 
skew  351 

over  spaces  152,  163,  177 
strain  343,  345 
stress  343,  345 
tensor algebra 142ff
in  quadratic-metric  spaces  287 
tensor  components  153,  197 
tensor  product  162,  163 
tensor product of linear spaces 149ff, 162, 177

alternative  description  of  174 
definition  of  152 
isomorphic  178 
tensor  quantities  197ff,  200 
term,  constant  77 
theorem 

on  classification  of  linear  quantities 
465 

Kronecker-Capelli  77 
on  outer  forms  386 



on  rank  of  a  matrix  32 
Vieta’s  419,  463 

theory  of  elasticity  222,  301,  343 
theory  of  relativity  408 
three-point  449 
toroidal  surface  455,  457 
torsion  of  a  space  curve  338,  340 
torus  455 

trace of a matrix of a linear transformation 212

trace  of  a  transformation  213 
transformation(s) 186
adjoint  of  308 
affine  410,  411,  434 
conjugate  479 

by  the  contravariant  law  146 
by  the  covariant  law  146 
determinant  of  213 
equal  186 

essential  part  of  343 
Hermitian  480 
identity  62,  187,  207 
induced  216 

isometric  273,  324,  325,  330,  413 
law  of  (of  a  scalar  quantity) 
192 

linear  (of  Euclidean  space)  308ff 
linear (of linear spaces) 207ff
linear  (of  a  tensor)  210f 
linear  (in  unitary  spaces)  479 
linear  (of  variables)  56,  57 
nilpotent  229,  230 
nonnegative  self-adjoint  340 
nonsingular  60,  214 
normal  479 

of  coordinates  in  a  change  of  basis 
66 

one-to-one  186 
orthogonal  300 
polynomials  of  208 
product  of  187,  188 
projective  427 
rank  of  213,  214 
self-adjoint 310, 311ff, 480
similarity  218,  226 
of  a  simple  structure  248 
singular  229 
skew  310,  322 
skew-adjoint  310,  322ff 
square  root  of  341 
trace  of  213 
unitary  480 
zero  207 

transformation  groups  186 
transpose  of  a  matrix  59 
transposition  of  a  matrix  59 




trihedral  339 
moving  339 
trilinear  form  170 
trivial  solution  82 
Tyshkevich,  R.  I.  485 


umbral  indices  55 
unimodular  group  196 
unit  of  a  group  180,  181 
unit  circle  278 
unit  matrix  62,  181 
unitary  group  480 
unitary  matrix  481 
unitary  metric  475 
unitary  space  471 
linear  transformation  in  478 
unitary  transformation  480 
units  point  442 


vector(s) 12, 17
binormal 338
components of 35
contravariant 142, 146
contravariant k-vectors 352
coordinates  of  35 
covariant  142,  146 
Darboux  339 
difference  between  24 
direction  358 
expansion  35 
first  233 

imaginary-unit  258 


isotropic 256, 278
junior 233
k-vector 366
last  233 
norm  of  256 
origin  of  70 
orthogonal  255 
principal  normal  338 
senior  233 

system  of  (linearly  dependent)  25 
system  of  (linearly  independent)  26 
tail  of  70 
terminus  70 
tip of 70
unit  258 

vector  product  304,  370,  371 
vector  space  17 
velocity,  angular  337 
vertices  of  complete  quadrangle  451 
vertices  of  convex  hull  103 
vertices  of  parallelepiped  102 
Vieta’s  theorem  419,  463 
volume  204 
oriented  201,  206 
of  a  parallelepiped  304,  368 


Yudin,  D.  B.  485 


zero  cone  of  a  quadratic  form  137,  139 
zero  subspace  of  a  bilinear  form  135 
zero  subspace  of  a  quadratic  form  138 


TO  THE  READER 


Mir Publishers would be grateful for your comments on the content,
translation and design of this book. We would also be pleased to receive any
other suggestions you may wish to make.

Our address is:

USSR, 129820, Moscow I-110, GSP
Pervy Rizhsky Pereulok, 2
MIR PUBLISHERS

Printed  in  the  Union  of  Soviet  Socialist  Republics 




